Introduction to Apple Intelligence
Technology is consistently entertaining new crazes. Some examples include blockchain, subscription juicers, netbooks, 3D televisions, hyperloop, and "hoverboards", just to name a handful of examples. All of these were going to be "the next big thing", but none of these have panned out as the inventors intended.
There has been a term bandied about that people think may be the end-all for computers. Said term is "Artificial Intelligence", or "AI". The term "AI" can mean a variety of different things, depending on whom you ask. However, when most use the term AI, what they are expecting is a fully conscious and sentient entity that can think, act, and rationalize as a human would. This is called "Artificial General Intelligence". Today's technology is nowhere even close to being able to come to this reality. It is not yet known whether or not Artificial Intelligence will actually live up to its ultimate expectations.
The term "Artificial Intelligence" can garner a number of thoughts, and depending on who you ask, these can range from intrigue, worry, elation, or even skepticism. Humans have long wanted to create a machine that can think like a human, and this has been depicted in media for a long time. Frankenstein is an example where a machine is made into a human and then is able to come to life . Another great example is Rosie from the 1960s cartoon The Jetsons. In case you are not aware, The Jetsons is a fictional animated tv show that depicts the far future where there are flying cars, and one of the characters, Rosie, is an robot that can perform many household tasks, like cleaning and cooking.
We, as a society, have come a long way to creating modern "artificial intelligence", but we are still nowhere close to creating a robot that is anywhere close to human. Today's modern artificial intelligence falls into a number of categories, in terms of its capabilities, but it is still a long way off from being the idealistic depiction that many expect artificial intelligence to be.
Artificial Intelligence comes in a variety of forms. This includes automated cleaning robots, automated driving, text generation, image generation, and even code completion. There are many companies that are attempting to create mainstream artificial intelligence, but nobody has done so that we know of.
Apple is one of those companies, but they are taking a different approach with their service called Apple Intelligence. Apple Intelligence is Apple's take on artificial intelligence. Apple Intelligence differs in a number of ways from standard "artificial intelligence". This includes the use of on-device models, private cloud computing, and personal context. Before we delve into each of those, let us look at artificial intelligence, including a history.
Artificial Intelligence
Artificial intelligence is not a new concept. You may think that it is a modern thing, but in fact, it harkens back to World War II and Alan Turing. Turing is known for creating a machine that could crack the German Enigma codes. In 1950, Turing released a paper which was the basis of what is known as the "Turing Test". The Turing Test is one where a machine is able to exhibit intelligent behavior that is indistinguishable from a human.
There have been a number of enhancements to artificial intelligence in recent years, and many of the concepts that have been used for a while have come into more common usage. Before we dive into some aspects of artificial intelligence, let us look at how humans learn.
How Human Brains Operate
In order to be able to attempt to recreate the human brain in a robot, we first need to understand how a human brain works. While we have progressed significantly in this, we are still extremely far from fully understanding how a human brain functions, let alone even attempting to control one.
Even though we do not know everything about the brain, there is quite a bit of information that we do know. Human brains are great at spotting patterns, and the way that this is done is by taking in large amounts of data, parsing that data, and then identifying a pattern. A great example of this is when people look at clouds. Clouds come in a variety of shapes and sizes, and many people attempt to find recognizable objects within the clouds. Someone is able to accomplish this by taking their existing knowledge, looking at the cloud, determining if there is a pattern, and if there is one, identifying the object.
When a human brain is attempting to identify an object, what it is doing is going through all of the objects (animals, plants, people, shapes, etc.) that they are aware of, quickly filtering them, and seeing if there is a match.
The human brain is a giant set of chemical and electrical synapses that connect to produce consciousness. The brain is commonly called a neural network due to the network of neural pathways. According to researchers, humans are able to update their knowledge. In a technical sense, what is happening is that the weights of the synaptic connections that are the basis of our neural network brain are updated. As we go through life, our previous experiences will shape our approach to things. Beyond this, it can also affect how we feel about things in a given moment, again, based upon our previous experiences.
This approach is similar to how artificial intelligence operates. Let us look at that next.
How Artificial Intelligence Works
The current way that artificial intelligence works is by allowing you to specify an input, or prompt, and having the model create an output. The output can be text, images, speech, or even just a decision. All artificial intelligence is based on what is called a Neural Network.
A Neural Network is a machine learning algorithm that is designed to make a decision. The manner in which this is done is by processing data through various nodes. Nodes generally belong to a single layer, and for each neural network, there are at least two layers: an input layer and an output layer.
Each node within a neural network is composed of three different things: weights, thresholds (also called a bias), and an output. Data goes into the node, the weights and thresholds are applied, and an output is created. A node requires the ability to actually come to a determination and is based on training, or what a human might call, knowledge.
Training
Humans have a variety of ways of learning something that can include family, friends, media, books, TV shows, audio, and just exploring. Neural Networks cannot be trained this way. Instead, neural networks need to be given a ton of data in order to be able to learn.
Each node within a neural network provides an output, sending that to another node, which provides its output, and the process continues until a result is determined. Each time that a result is determined, a positive or negative correlation is determined. Much like a human, the more positive connections that are made, the better, and eventually, the positive correlation between an answer and the result will push away the negative connections. Once it has made enough positive correlations (gotten the right answer), it will eventually be trained.
There are actually two types of training: Supervised Learning and Reinforcement Learning.
Supervised Learning is the idea of feeding a training model so that it can learn the rules and provide the proper output. Typically, this is done using two methods: either classification or regression. Classification is pretty simple to understand. Let us say that you have 1000 pictures, 500 dogs, and 500 cats. You provide the training model with each photo individually and you tell it the type of pet for each image.
Reinforcement learning is similar, but different. In this scenario, let us say you have the same 1000 pictures, again 500 dogs and 500 cats. But instead of telling the model what is what, you let it determine the similarities between the items and as it continues to get them right, that will reinforce what it already knows.
Inference
Inference, in reference to artificial intelligence, is the process of applying a training model to a set of data. The best way to test a model is to provide it with brand-new data to try and infer the result with this brand-new data.
Artificial Intelligence works by taking the input of the new data, applying the weights, also known as parameters, that are stored in the model and applying them to the actual data.
Inference is not free, it does have a cost, most particularly when it comes to energy usage. This is where optimizations can be useful. As an example, Apple will utilize the Neural Engine as much as possible for its on-device inference. The reason for this is because the Neural Engine is optimized to perform inference tasks, while minimizing the amount of energy needed.
Artificial Intelligence Use Cases
No tool is inherently good or inherently bad, the tool is the tool. It is how it is used that determines whether it is a positive usage or a negative use. Artificial Intelligence is no different in this. Artificial intelligence can have a wide range of possible use cases. Current artificial intelligence is capable of performing actions related to detecting cancer, synthesizing new drugs, detecting brain signals in amputees, and much more. These are all health-related, but that is where many artificial intelligence models are thriving, at least at the moment, but that is not all that is possible.
Not all artificial intelligence usage is positive. There are many who will want to make what are called "Deep Fakes". A deep fake is a way of taking someone and either placing them in a situation where they never were, or even making them say something that they have never said. This is not new, not by a long shot. Since the inception of photos, there have always been manipulations. This is designed to influence someone into thinking a particular way. As you might guess, this can have detrimental effects because it distorts reality. While there are those who want to use these for nefarious purposes, there can be some positive use cases for this type of technology.
Back in 2013, country music artist Randy Travis suffered a stroke and, as a result, now suffers from aphasia, which, according to the Mayo Clinic, is "a disorder that affects how you communicate." This effectively left him unable to perform. However, in May of 2024, a brand-new Randy Travis song was released using artificial intelligence that used two proprietary AI models to help create the song. This was done with full permission from Randy Travis himself, so there is no issue there.
Let us look at a couple of different approaches used, including Large Language Models and Image Generators.
Large Language Models
Large language models, or LLMs, are those that are able to generate language that a human would understand. To quote IBM:
"In a nutshell, LLMs are designed to understand and generate text like a human, in addition to other forms of content, based on the vast amount of data used to train them. They have the ability to infer from context, generate coherent and contextually relevant responses, translate to languages other than English, summarize text, answer questions (general conversation and FAQs), and even assist in creative writing or code generation tasks." - Source: IBM.
LLMs can be used for generating, rewriting, or even changing the tone of text. The reason that this is possible is because most languages have pretty rigid rules, and it is not a complex task to calculate the probability of what the next word would be in a sentence.
The way that an LLM is trained is by consuming vast amounts of text. It then recognizes patterns from this data and then it can generate text based upon what it has learned.
Image Generation
One of the uses of modern artificial intelligence is the ability to create images. Similar to LLMs, there are image generation models that have been trained on a massive number of images. This data has been used to train the models which are used for the actual image generation. Depending on the model, you may be able to generate various types of images, ranging from cartoons to completely realistic ones.
Image generation models use a technique called Generative Adversarial Networks, or GANs. The way that a GAN works is using two different algorithms, the generator, and the discriminator, that work in tandem. The generator will output a bunch of random pixels as an image and then send it over to the discriminator. The discriminator, which has knowledge of millions of pictures of what you are trying to generate, will provide a result, which is basically a "Yes" or "No". If it is a 'no', then the generator will try again and again.
This back and forth is what is called an "adversarial loop" and this loop continues until the generator is able to generate something that the discriminator will say matches the intended type of image.
The training for GANs is quite interesting. It starts with an image and then purposely introduces noise into the image, and it does so again, and again, and again. This process reiterates a large number of times. This noisy data is what becomes the basis for the generator.
All of this is a good base for looking at what Apple has in store for its own artificial intelligence technologies, so let us look at that now.
Apple and Artificial Intelligence
You might think that Apple is late to the artificial intelligence realm, but in fact, Apple has been working with artificial intelligence for many years; it has just been called something else. Some of the areas where Apple has been using artificial intelligence have been with Photos, Siri, Messages, and even auto-correct.
Apple Intelligence
As mentioned above, Apple Intelligence is Apple's take on artificial intelligence. Apple Intelligence differs from standard artificial intelligence in that Apple intelligence is designed to work on YOUR information, not on general knowledge. The primary benefit of working on your data is that your data can remain private. This is done using on-device models.
On-Device Requests
A vast majority of Apple Intelligence requests will be performed on your device. There are a number of examples of this, including things like:
- "Find me pictures of [someone] while in London."
- "When is Mom's flight landing?"
Apple has been doing a lot of research with machine learning to be able to run on-device. This has meant that the machine learning models have needed to be kept the same in terms of quality but need to be able to be used on devices with limited amounts of memory. Limited, of course, is relative. We are not talking like 1GB of RAM, but more like 8GB.
The reason that Apple wants to be able to do much of the processing on your device is twofold. The first is response time. By having devices handle requests, they can be almost instantaneous. This is quite beneficial for those times when you may not have connectivity. Beyond this, sending all of your requests to the cloud would end up providing some sort of delay, even with a direct connection and incredibly fast connection speeds.
The second reason is privacy. Privacy is a big part of Apple's core beliefs. When using your own device and processing the request on the device, that means that nobody else will get access to your data, not even Apple. Instead, only you will have access to your data, which is great for your own peace of mind.
Even though as much as possible will be done on your own devices, there may be instances when your device is not able to handle your request locally. Instead, it may need to be sent to the cloud. This can be needed for larger models that require additional memory or processing to be done. If this is needed, it is handled automatically by sending it to Apple's Private Cloud Compute platform. Let us look at that next.
Private Cloud Compute
Nobody wants their data to get out of their control, yet it does happen from time to time. Apple takes data privacy seriously and has done a lot to help keep people's data private. This is in contrast to other artificial intelligence companies, who have no compunction to take user data and use it to train their machine learning models.
Apple has been working on reducing the size and memory requirements for many machine learning models. They have accomplished quite a bit, but right now there are some machine learning models that require more tokens, which means more memory, than devices are capable of having. In these instances, it may be necessary to use the cloud to handle requests.
Apple has 1.2 billion users, and while not all of the users will utilize Apple Intelligence immediately, Apple still needs to scale up Apple Intelligence to support all users who will be using it. In order to make this happen, Apple could just order as many servers as they want, plug them in, and make it all work. However, that has its own set of tradeoffs. Instead, Apple has opted to utilize their own hardware, create their own servers, and make things as seamless as possible for the end user, all while protecting user data.
Private Cloud Compute is what powers online requests for Apple Intelligence. Private Cloud Compute runs in Apple's own data centers. Private Cloud Compute is powered by a series of nodes. Each of these nodes uses Apple Silicon to process requests. These are not just standard Macs; they have been heavily customized.
Nodes
Each Private Cloud Compute node undergoes significant quality checks in order to maintain integrity. Before the node is sealed and its tamper switch activated, each component undergoes a high-resolution scan to make sure that it has not been modified. After the node has been shipped and arrives at an Apple data center, it undergoes another verification to make sure it still remains untouched. This process is handled by multiple teams and overseen by a third party who is not affiliated with Apple. Once verification has been completed, the node is deployed, and a certificate is issued for the keys embedded in the Secure Enclave. Once the certificate has been created, it can be used.
Request Routing
Protecting the node is just the first step in securing user data. In order to protect user data, Apple uses what is called "target diffusion". This is a process of making sure that a user's request cannot be sent to a specific node based on the user or its content.
Target diffusion begins with the metadata of the request. This information strips out user-specific data as well as the source device. The metadata is used by the load balancers to route the request to the appropriate model. In order to limit what is called a "replay attack", each request has a single-use credential which is used to authorize requests without tying it to a specific user.
All requests are routed through an Oblivious HTTP, or OHTTP, relay, managed by a third-party provider, which hides the device's source IP address well before it ever reaches the Private Cloud Compute node. This is similar to how Private Relay works, where the actual destination server never knows your true IP address. In order to steer a request based on source IP, both Apple's Load Balancer as well as the HTTP relay would need to be compromised; while possible, it is unlikely.
User Requests
When a user's device makes a request, it is not sent to the entire Private Cloud Compute service as a whole; instead, pieces of the request are routed to different nodes by the load balancer. The response that is sent back to the user's device will specify the individual nodes that should be ready to handle the inference request.
When the load balancer selects which nodes to use, an auditable trail is created. This is to protect against an attack where an attacker compromises a node and manages to obtain complete control of the load balancer.
Transparency
When it comes to privacy, one could say, with confidence, that Apple does what they say they are doing. However, in order to provide some transparency and verification, Apple is allowing security researchers the ability to inspect software images. This is beyond what any other cloud company is doing.
In order to make sure there is transparency, each production build of Apple's Private Cloud Compute software will be appended to a write-only log. This will allow verification that the software that is being used is exactly what it claims to be. Apple is taking some additional steps. From Apple's post on Private Cloud Compute:
Our commitment to verifiable transparency includes:
1. Publishing the measurements of all code running on PCC in an append-only and cryptographically tamper-proof transparency log.
2. Making the log and associated binary software images publicly available for inspection and validation by privacy and security experts.
3. Publishing and maintaining an official set of tools for researchers analyzing PCC node software.
4. Rewarding important research findings through the Apple Security Bounty program.
This means that should an issue be found, Apple will be notified before it can become an issue, take actions to remedy the issue, and release new software, all in an attempt to keep user data private.
Privacy
When a request is sent to Apple's Private Cloud Compute, only your device and the server can communicate. Your data is sent to the server, processed, and returned to you. After the request is complete, the memory on the server is wiped so your data cannot be retrieved. This includes wiping the cryptographic keys on the data volume. Upon reboot, these keys are regenerated and never stored. The result of this is that no data can be retrieved because the cryptographic keys are sufficiently random that they could never be regenerated.
Apple has gone to extensive lengths to make sure that nobody's data can be compromised. This includes removing remote access features for administration, high-resolution scanning of the Private Cloud Compute node before it is sealed, and making sure that requests cannot be routed to specific nodes, which may allow someone to compromise data. Beyond this, when a Private Cloud Compute node is rebooted, the cryptographic keys that run the server are completely regenerated, so any previous data is no longer readable.
For even more detail, be sure to check out Apple's blog post called "Private Cloud Compute" available at https://security.apple.com/blog/private-cloud-compute.
General World Knowledge
Apple Intelligence is designed to work on your private data, but there may be times when you need to go beyond your own data and use general world knowledge. This could be something like asking for a recipe for some ingredients you have, or it could be a historical fact, or even to confirm some existing data.
Apple Intelligence is not capable of handling these types of requests. Instead, you will be prompted to send these types of requests off to third parties, like OpenAI's ChatGPT. When you are prompted to use one of these, you will need to confirm that you want to send your request and that your private information (for that specific request) will be sent to the third party.
At launch, only OpenAI's ChatGPT will be available. However, there will be more third-party options coming in the future. This type of arrangement is a good escape valve should you need to get some information that is not within your own private data. Now that we have covered what Private Cloud Compute is, let us look at what it will take to run Apple Intelligence.
Minimum Requirements
Apple Intelligence does require a minimum set of requirements in order to be used. Apple Intelligence will work on the following devices:
- iPhone 16 Pro/Pro Max (A18 Pro)
- iPhone 16/16 Plus (A18)
- iPhone 15 Pro/Pro Max (A17 Pro)
- iPad mini (A17 Pro)
- iPad Pro (M1 and later)
- iPad Air (M1 and later)
- MacBook Air (M1 and later)
- MacBook Pro (M1 and later)
- Mac mini (M1 and later)
- Mac Studio (M1 Max and later)
- Mac Pro (M2 Ultra and later)
There are a couple of reasons why these are the devices that can be used. The first is that they require a neural engine. For the Mac, this was not present until 2020 when the first Macs with Apple Silicon were released. For the iPhone, the first Neural Engine appeared with the A11 Bionic chip on the iPhone 8, 8 Plus, and iPhone X. All iPhones since have included a Neural Engine, but that is just one requirement.
The second requirement is the amount of memory. The minimum amount of memory to run the on-device models is 8 gigabytes. The iPhone 15 Pro and iPhone 15 Pro Max are the first iPhones to come with 8GB of memory. All M1 Macs have had at least 8GB of memory.
Now, this is the minimum amount of memory. Not all features will work with only 8GB of memory. One example is a new feature for developers within Apple's Xcode app. With Xcode 16, developers will have the option of using Apple's Predictive Code Completion Model. When you install Xcode 16, there is an option that allows you to download the Predictive Code completion model, but only if your Mac has 16GB of memory or more. To illustrate this, if you have a Mac mini with 8GB of memory, you will get the following installation screen.
Similarly, let us say you have a MacBook Pro with 32GB of unified memory, you will get this installation screen.
As you can see, the Predictive Code Completion checkbox is not even an option on the Mac mini with 8GB of memory. And the Predictive Code Completion is a pretty limited amount of knowledge. Swift, while being a large programming language, is limited in scope, and that model does not work on 8GB.
It would not be presumptuous to think that this may be the case for various Apple Intelligence models going forward. Now that we have covered the minimum requirements, let us look at some of the use cases that Apple Intelligence can handle, starting with something called Genmoji.
Enabling Apple Intelligence
As outlined above, Apple Intelligence is available for compatible devices running iOS 18.1, iPadOS 18.1, or macOS Sequoia 15.1. However, Apple Intelligence is not automatically enabled. Instead, you will need to enable it. Apple Intelligence is activated on a per Apple Account basis. This only needs to be done once. Once activated, it will need to be enabled per device. To activate Apple Intelligence perform these steps:
- Open Settings on iOS, or iPadOS, or System Settings on macOS Sequoia.
- Scroll down to "Apple Intelligence".
- Tap, or click, on "Apple Intelligence" to bring up the settings.
- Tap, or click, on "Join Apple Intelligence Waitlist". A popup will appear
- Tap on the "Join Apple Intelligence Waitlist" button to confirm you want to join the waitlist.
Once you do this, you will join the Apple Intelligence waitlist. It may take some time before you are able to access the features. Once your Apple Account has had Apple Intelligence activated on it, you will then get a notification on your device indicating that Apple Intelligence is ready.
At this point, you can click on the "Turn On Apple Intelligence" button, and a popup will appear that will allow you to enable the features. Once you have enabled Apple Intelligence on your device, you will be able to use the features.
Closing Thoughts on Apple Intelligence
Many Artificial Intelligence tools require sending your private data to a server in the cloud to be able to perform a particular task. Doing this has the potential to not only leak your private data, but your private data can possibly be used to train additional artificial intelligence models. This is an antithesis to the core values of Apple, so Apple has taken a different approach with their own artificial intelligence that they are calling Apple Intelligence.
Apple Intelligence is designed to work on your private data and maintain that privacy. The way that this is accomplished is through a service called Private Cloud Compute. Private Cloud Compute is a set of servers in Apple's own datacenter that are built on Apple Silicon, utilizing features like the Secure Enclave to maintain the integrity of the server. Beyond this, each time that a request has been completed, the previous keys are wiped, and the server is completely reset and reinitialized with no data being retained between reboots.
Apple Intelligence is designed to help you accomplish tasks that you need, like summarizing text, generating new emojis, creating images, and more.
Apple Intelligence will be a beta feature starting in late 2024, with some overall features not coming until 2025, and it will be English only at first. Furthermore, these features will not be available in the European Union, at least not at first.
Apple Intelligence will have some pretty stiff requirements, so it will not work on all devices. In fact, you will need to have an Apple Silicon Mac or an iPad with an M1 or newer, or an A17 Pro. For the iPhone, you will need a device with an A17 Pro, A18, or A18 Pro. These devices are the iPhone 15 Pro, iPhone 16/16 Plus, or iPhone 16 Pro/Pro Max to take advantage of the Apple Intelligence features.
This is merely an introduction to Apple Intelligence, There will be more articles in this series, so be sure to check out those articles.