Vision Pro Gestures: How Hand and Eye Navigation Works on Vision Pro

Apple Vision Pro gestures power visionOS navigation through eye tracking and subtle hand movements.

A person wearing a VR headset sits on a couch in a living room, interacting with floating digital app icons using Vision Pro Gestures, seamlessly blending virtual elements with the real home environment.
Image Credit: Apple Inc.

Vision Pro gestures define how users move through visionOS. Instead of relying on controllers or physical buttons, the system combines precise eye tracking with subtle hand gestures to create a navigation model that feels direct and minimal.

Understanding how visionOS gestures work is essential to getting comfortable inside the spatial interface. Once learned, the system becomes intuitive and fluid.

Vision Pro Gestures: How Eye Tracking Powers Selection

On Vision Pro, your eyes act as the primary pointer. When you look at an icon, button, or interface element, visionOS detects your gaze and highlights that element automatically.

There is no cursor to drag. No touch surface to swipe. Your focus determines the target.

The headset uses infrared cameras and LEDs inside the device to track where your eyes are directed. When your gaze rests on an element, that element highlights. The visual response confirms the target before any hand movement is required.

This approach reduces unnecessary hand motion and keeps interactions natural.

A person wearing a VR headset stands in an office, using Vision Pro Gestures to interact with a large, virtual 3D model of a chair. Blue lines and measurements (38", 15", 42") highlight the chair's dimensions.
Image Credit: Apple Inc.

The Pinch: The Core Action

Once you are looking at an item, a simple pinch gesture selects it.

Touch your thumb and index finger together lightly > Release

That small motion functions like a tap on an iPhone screen. It opens apps, presses buttons, and confirms actions.

Because the gesture is subtle, you do not need to lift your arm high or exaggerate movement. Your hands can rest comfortably on your lap or at your side.

This combination of look, then pinch, forms the foundation of Vision Pro gestures.
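
For readers who think in code, the look-then-pinch pattern can be sketched as a tiny model. This is an illustrative Python sketch only, not Apple's implementation and not a visionOS API: the names `Element` and `GazePinchSelector` are hypothetical, and gaze is simplified to a 2D point tested against rectangles.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    """Hypothetical interface element with a rectangular hit area."""
    name: str
    x: float
    y: float
    width: float
    height: float

    def contains(self, gx: float, gy: float) -> bool:
        # True when the gaze point falls inside this element's rectangle.
        return (self.x <= gx <= self.x + self.width
                and self.y <= gy <= self.y + self.height)


class GazePinchSelector:
    """Illustrative model: gaze picks the target, a pinch activates it."""

    def __init__(self, elements: list[Element]) -> None:
        self.elements = elements
        self.highlighted: Optional[Element] = None

    def look_at(self, gx: float, gy: float) -> Optional[Element]:
        # Highlight the first element under the gaze point, if any.
        self.highlighted = next(
            (e for e in self.elements if e.contains(gx, gy)), None)
        return self.highlighted

    def pinch(self) -> Optional[str]:
        # A pinch acts only on whatever is currently highlighted.
        return self.highlighted.name if self.highlighted else None


selector = GazePinchSelector([
    Element("Safari", 0, 0, 100, 100),
    Element("Photos", 120, 0, 100, 100),
])
selector.look_at(150, 50)   # gaze lands on the Photos icon
opened = selector.pinch()   # pinch activates the highlighted icon
print(opened)
```

The point of the sketch is the split in responsibility: looking only changes what is highlighted, and the pinch is the one step that triggers an action. A pinch with nothing highlighted does nothing.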

Scrolling and Navigating Windows

Scrolling in visionOS also uses a natural pinch motion.

Look at a scrollable window > Pinch your thumb and index finger together > Move your hand up or down

The system interprets this as scrolling. The motion is similar to dragging content on a touchscreen, but without physical contact.

For horizontal scrolling, move your hand left or right while maintaining the pinch.
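
The pinch-and-move scroll described above can be pictured as a simple mapping from hand movement to content offset. The following is a rough Python sketch under stated assumptions, not how visionOS implements scrolling: `ScrollView` and its methods are hypothetical names, hand positions are plain numbers in arbitrary units, and the content is assumed to follow the hand one-to-one.

```python
from dataclasses import dataclass


@dataclass
class ScrollView:
    """Hypothetical scrollable window tracking a 2D content offset."""
    offset_x: float = 0.0
    offset_y: float = 0.0
    pinching: bool = False
    last_hand: tuple = (0.0, 0.0)

    def pinch_began(self, hand_x: float, hand_y: float) -> None:
        # Remember where the hand was when the pinch started.
        self.pinching = True
        self.last_hand = (hand_x, hand_y)

    def hand_moved(self, hand_x: float, hand_y: float) -> None:
        # Hand movement scrolls only while the pinch is held.
        if not self.pinching:
            return
        self.offset_x += hand_x - self.last_hand[0]
        self.offset_y += hand_y - self.last_hand[1]
        self.last_hand = (hand_x, hand_y)

    def pinch_ended(self) -> None:
        self.pinching = False


view = ScrollView()
view.hand_moved(0, 50)      # ignored: no pinch yet
view.pinch_began(0, 0)
view.hand_moved(0, 30)      # vertical scroll
view.hand_moved(10, 30)     # horizontal scroll, pinch still held
view.pinch_ended()
view.hand_moved(100, 100)   # ignored: pinch released
print(view.offset_x, view.offset_y)
```

In this model the pinch works like a finger staying in contact with a touchscreen: movement counts only between the start and end of the pinch.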

To close a window:

Look at the close button > Pinch

The same simple action applies across system controls, reinforcing muscle memory.

A sleek, black, futuristic virtual reality headset with a smooth, curved design and reflective surface—perfect for exploring the latest Betas—viewed from the front against a white background.
Image Credit: Apple Inc.

Going Home and Accessing the App Grid

Vision Pro includes a Digital Crown on the device. While Vision Pro gestures handle most interactions, the Digital Crown offers a consistent way to return to the Home View.

Press the Digital Crown

The app grid appears in your space.

From there:

Look at an app icon > Pinch to open

This consistency ensures that navigation always has a clear reset point.

Resizing and Moving Windows in Space

visionOS allows windows to float and be repositioned within your environment.

Look at the window bar > Pinch and hold > Move your hand

The window follows your movement, letting you reposition it within your physical space.

To resize:

Look at the corner handle of a window > Pinch and drag outward or inward

This expands or shrinks the window proportionally. These gestures make spatial computing feel physical without requiring exaggerated motion.
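
Proportional resizing comes down to simple arithmetic: the drag changes one dimension and the other is scaled by the same factor. Here is a minimal Python sketch of that idea, assuming the drag is measured horizontally and that windows have a minimum width. The function name `resize_proportionally` and the `min_width` default are hypothetical and do not reflect actual visionOS behavior or limits.

```python
def resize_proportionally(width, height, drag_dx, min_width=100.0):
    """Scale a window uniformly based on a horizontal corner drag.

    A positive drag_dx (outward) grows the window, a negative one
    (inward) shrinks it. The aspect ratio is preserved.
    """
    # Apply the drag to the width, never going below the minimum.
    new_width = max(min_width, width + drag_dx)
    # Scale the height by the same factor to keep the proportions.
    scale = new_width / width
    return new_width, height * scale


bigger = resize_proportionally(800, 600, 200)     # drag outward
smaller = resize_proportionally(800, 600, -400)   # drag inward
print(bigger, smaller)
```

Dragging outward by 200 units turns an 800 by 600 window into 1000 by 750, and both sizes share the same 4:3 ratio.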

System-Level Controls

Control Center and system settings are also accessible through gaze and pinch combinations.

Look upward slightly > Control Center indicator appears > Pinch

From there, you can adjust brightness, volume, or environment settings using the same look-and-pinch method. The interaction model remains consistent across apps and system layers.

A person’s hand holding a VR controller uses Vision Pro Gestures to draw a blue line in a mixed reality workspace, with a desk, chair, bicycle, and office supplies visible in the background.
Image Credit: Apple Inc.

Natural Interaction

Traditional VR headsets often rely on handheld controllers. Apple removed that layer entirely. By merging eye tracking with minimal hand gestures, Vision Pro reduces hardware complexity and keeps interactions more direct.

The absence of controllers shifts the learning curve toward understanding eye precision and subtle finger motion. Once comfortable, users may find it faster than traditional pointer-based navigation.

visionOS is built around this interaction model. Apps designed for Vision Pro follow the same selection and gesture standards, ensuring a unified experience.

Learning Curve and Daily Use

Most users adapt within minutes. The key is to relax hand movements and rely on gaze accuracy. Large gestures are unnecessary; small pinches and slight motions are enough.

As apps become more spatial, Vision Pro gestures will continue shaping how users interact with content. Whether browsing Safari, watching immersive video, or arranging multiple windows in space, eye tracking and pinch controls remain central.

Navigating visionOS becomes second nature when the pattern is clear: look to target, pinch to act, move to adjust.

Jack
About the Author

Jack is a journalist at AppleMagazine, covering technology, digital culture, and the fast-changing relationship between people and platforms. With a background in digital media, his work focuses on how emerging technologies shape everyday life, from AI and streaming to social media and consumer tech.