
ZDNET's Kerry Wan takes a photo with the Google Pixel 10 Pro camera.
Sabrina Ortiz/ZDNET
Isaac Reynolds has been working on the Pixel Camera team at Google for nearly a decade — since the first Google Pixel phone launched in 2016. And yet, I think it's fair to say that he has never been more bullish about the technology Google has built into a phone camera than he is with this year's Pixel 10 Pro. A new wave of AI breakthroughs in the past year has allowed Google to use Large Language Models, machine learning, and generative AI imaging to unlock new capabilities that power another meaningful leap forward in phone photography.
I got the chance to sit down with Reynolds as he was still catching his breath from the launch of the Pixel 10 phones — and at the same time, ramping up for the next set of camera upgrades the team is preparing for the 2026 Pixel phones.
Also: Pixel just zoomed ahead of iPhone in the camera photography race
I peppered Reynolds with all of my burning questions about Pro Res Zoom, Conversational Editing, Camera Coach, AI models, the Tensor G5 chip, Auto Best Take, and the larger ambitions of the Pixel Camera team. At the same time, he challenged me with information I didn't expect on Telephoto Panoramas, C2PA AI metadata, Guided Frame, and educating the public about AI.
I got to unpack a lot about how the Google team was able to engineer such big advances in the Pixel 10 Pro camera system, and we delved far deeper into the new photography features than Google did at its 2025 Made by Google event or in its published blog post.
Here's my reporter's notebook on what I learned.
Mission of the Pixel Camera team
"I think the biggest thing our team has always been focused on is what I call durable [photography] problems — low light, zoom, dynamic range, and detail," said Reynolds. "And every generation [of Pixel] has brought new technologies."
Camera Coach
Reynolds noted, "LLMs have such an enormous context window, and they're so powerful at understanding, that we can actually teach people to do things that tech can't do.
"Today, tech can't move the camera down four feet. Tech can't walk the camera over 100 yards to the better viewpoint. It can't tell you to turn 90 degrees. Now, Camera Coach can do that kind of stuff. So that's just another way we're using technology to solve some of these durable problems."
Google Pixel 10 Pro's Camera Coach.
Sabrina Ortiz/ZDNET
Conversational editing
One of the most surprising new features Google introduced with the Pixel 10 was conversational photo editing — though that is technically a feature in the Google Photos app. It lets you simply describe what you want changed in the photo, by voice or typing, and the AI takes care of the rest. So you can remove a tree, re-center the image, or add more clouds to the sky, for example.
Conversational editing in Google Photos.
As Reynolds explained it, "Conversational editing just takes the whole interface away, and it's essentially a mapping function from natural language to the things that were in the editor. So you can say, 'Erase the thing on the left,' and it will just figure out what the thing on the left is and then invoke Magic Eraser. You can say, 'Hey, when I was in Utah I remember the rocks being more red than that,' and it just increases the warmth a little bit. You can say, 'Can you focus on the thing in the center,' and it puts a little vignette around it.
"And that mapping is a huge time saver. The promise of AI was not just that it would be informational, but that it would take actions for you. And I think this is one of the most outstanding cases of the AI not just reminding you of something … but doing it for you. It has been really, really cool to see how effective it is.
"It even gives you suggestions. The AI will look at a picture and say, 'I think you have some bystanders you'd want to remove.' And so it populates these little suggestion chips. The funniest part of the suggestion chips is that when you tap them, all it does is type into the text box. It's not a separate pathway. You just tap the chip and it sticks something in the text box. You could have written that yourself. It's not doing anything wildly different than you could do… It also has the voice button, which is super cool. You can just talk to it if you want to. The AI is getting so good so much faster than I could have imagined, and I'm an expert in this space."
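For a concrete picture of the mapping layer Reynolds describes, here is a minimal sketch in Python. The editor functions (magic_eraser, adjust_warmth, add_vignette) and the keyword dispatcher are hypothetical stand-ins, not the Google Photos API; a production system would use an LLM to parse the request and extract parameters rather than keyword matching.

```python
# Minimal sketch of a natural-language-to-editor mapping layer.
# The editor functions and the dispatch logic are hypothetical placeholders,
# not the actual Google Photos implementation.

def magic_eraser(region: str) -> None:
    print(f"Erasing object in region: {region}")

def adjust_warmth(delta: float) -> None:
    print(f"Increasing warmth by {delta}")

def add_vignette(strength: float) -> None:
    print(f"Adding vignette with strength {strength}")

def handle_request(utterance: str) -> None:
    """Map a free-form request onto an existing editor tool.

    A real system would use an LLM to classify intent; this sketch uses
    keyword matching only to show the shape of the mapping.
    """
    text = utterance.lower()
    if "erase" in text or "remove" in text:
        magic_eraser("left")      # the region would come from the model
    elif "warm" in text or "red" in text:
        adjust_warmth(0.1)
    elif "focus" in text or "center" in text:
        add_vignette(0.3)
    else:
        print("No matching editor action found.")

handle_request("Erase the thing on the left")
handle_request("I remember the rocks being more red than that")
```

The key point the sketch illustrates is the one Reynolds makes about the suggestion chips: the language layer is not a separate pathway, just a thin translation onto tools the editor already has.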
Pro Res Zoom
As a photographer who loves zoom photography, this was the feature I wanted to talk with Reynolds about the most. I take a lot of photos with smartphones, but long-distance zooms are where I most often need to pull out my Sony mirrorless camera and 70-200mm lens. I've already written about how excited I am to thoroughly test Pro Res Zoom, because it could help produce far more usable zoom photos from a phone by using generative AI to fill in the gaps in digital zoom.
Reynolds commented, "The fundamental problem is, how do I do a digital zoom where you've got a sensor pixel on the far right corner, and then another one on the bottom left corner, and you have to fill in all the pixels in between. You can do an interpolation. You can just set all of them to be some color, like just average them. We have grown throughout the process here. We have gone through multi-frame denoise. We have gone through several different generations of upscalers to make better interpolations. We went to a multi-frame merge that was block-by-block. And then the biggest advancement that was Super Res Zoom was going from a block-by-block multi-frame to a probabilistic pixel-by-pixel multi-frame… In parallel, the upscalers have been improving. And the latest generation upscaler is the largest model we have ever run in Pixel Camera, ever… And it's just a really, really good interpolator.
"It doesn't just say this is black and this is white, and so the middle is gray. It's like, well, I know that that black pixel is part of a larger structure. I know that that larger structure appears to be the grout in between some brick on a facade. And so probably it can be black up until that point, and then it can turn red — which is so much smarter than just going, 'Well, it's black and it's red. So, I don't know. I guess we'll just blend them as we go across.' So we still have those real things as real pixels, and then we have to fill in what's in between. And now the models are just so, so good at that.
The top photo is at 0.5x zoom and the bottom is the same framing at 100x on the Pixel 10 Pro.
Google (screenshot by Jason Hiner/ZDNET)
"We have had a long line of upscalers, and this is the latest one. All the upscalers have artifacts. Different upscalers have different kinds of problems. We have had upscalers in the past that were very, very good at text — because text has very harsh lines — but very bad at water, because water is fundamentally chaotic. This upscaler has its own artifacts, and those artifacts are very difficult for the human eye to recognize, because the new models are so good at making content that is 100% authentic to the scene.
"Like, yes, that's a leaf on a tree. That's exactly what a leaf on a tree looks like. It's flawless. But for a human face, there is so much of the human brain devoted to recognizing faces that no level of artifact is effectively acceptable. The level of subtle artifact on a leaf, you would never notice. But the same subtlety on a face, you notice immediately — just because we're human beings and we're designed to recognize other human beings. We're social creatures, so the bar for actually doing a good job with human faces is extraordinarily high."
As a result, when Pro Res Zoom recognizes a human face, it won't use the AI to upscale it.
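To see why the naive interpolation Reynolds contrasts against falls short, here is a small illustrative sketch with made-up pixel values: a plain linear fill between two real samples produces a smooth smear, while a structure-aware upscaler would place a hard edge where the grout meets the brick. This is an illustration of the idea, not Google's model.

```python
# Naive interpolation vs. a structure-aware fill, on one row of pixels.
# Values, array sizes, and the "inferred" edge position are illustrative only.

import numpy as np

# Two real sensor samples at the ends of a row: black grout and red brick.
black = np.array([0.0, 0.0, 0.0])
red = np.array([0.8, 0.1, 0.1])

# Naive approach: linearly blend the missing pixels in between.
t = np.linspace(0.0, 1.0, 10)[:, None]        # blend weights for 10 output pixels
naive_row = (1 - t) * black + t * red         # smooth gradient, no real edge

# What a learned upscaler effectively does instead: infer where the structure
# (the grout line) ends and place a sharp transition there.
edge_position = 3                             # hypothetical inferred boundary
structured_row = np.where(
    (np.arange(10) < edge_position)[:, None], black, red
)

print(np.round(naive_row, 2))
print(np.round(structured_row, 2))
```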
C2PA metadata to label AI
Because Google is now part of the Coalition for Content Provenance and Authenticity (C2PA), it has started to embed metadata into its photos to indicate whether generative AI was used to make the image, using SynthID, a watermark created by Google DeepMind. Reynolds was deeply involved in the project to make this part of Pixel Camera.
"The [C2PA] metadata identifies whether this was AI or not, and it just generally tells you the history of the picture, and we embed it," said Reynolds. "I was personally the product manager for that. I don't do things personally like that a lot anymore, but I did take that one because I knew how important, nuanced, and subtle it was. And the deeper I got into that feature, the more I realized how little people actually know about what AI is or isn't, what it can and can't do, or how fast or slow it's progressing."
An example of Google C2PA metadata for AI.
Also: Google Pixel 10 series hands-on: I did not expect this model to be my favorite
Educating the public about AI
"The world is honestly behind in terms of not knowing how good AI already is. So there's some education to do. And we realize that AI can do things that I think users would really, really like if they understood better what was going on. So part of what we do in Pro Res Zoom is we don't touch faces. I think that'll make people more comfortable. We also show them the before and after — the version with the new upscaler and the one without it — and you get to decide for yourself: what did AI do? Did I find it acceptable or unacceptable? The vast majority are finding it more than acceptable — highly preferred, in fact. They want the upscale. But they wouldn't know that if they didn't get to see the side-by-side.
"And then we also label it with content credentials [C2PA] so that every time they transmit that image, somebody else can make their own decision about, 'How do I consider this image? Do I discount this as maybe AI? Or do I go, oh no, the content credentials are right there. They say it's not AI at all. That's great. I have so much more trust now.' And as users learn more, as they're educated more, as they gain more comfort and more real-world data points of what is AI and what isn't, I think they are going to end up being more comfortable over time, and that's what we're seeing with Pro Res Zoom already. The customer satisfaction that we measured pre-launch was so good for that feature.
"And as the technology gets better, we'll do more. We'll put this stuff into more modes, perhaps. We'll push the zoom to a little higher quality. But we really want to make sure that we're doing that as users expect and understand it. So we're giving you options and choices and transparency, but we're also trying to push the boundaries of technology in a way that keeps customer satisfaction high."
Google Pixel 10 Pro camera.
Sabrina Ortiz/ZDNET
Telephoto Panoramas
"There are always little goodies hidden all over the [camera] app," Reynolds told me. "We build more stuff than we can realistically talk about."
One of the new photography features in the Pixel 10 Pro that Google hasn't talked much about is Telephoto Panoramas, or what the team affectionately calls "5x tele-panos."
These let you take more cinematic panorama shots using the zoom lens, new viewfinder controls, and the ability to shoot 360 degrees at up to 100MP resolution. "There's something that's just so nice about zooming in with your lens and then stitching the panorama," said Reynolds.
But what Google hasn't talked about is the fact that it's using an entirely new method of capturing these panoramic photos.
"A lot of panoramas on the market, and ours historically as well, have been video based," Reynolds noted. "And what that means is, to make a panorama you take 100 to 1,000 pictures, and from each one of them you stitch a little tiny vertical slice. So that means two things. Number one, it means that the artifacts you get tend to be curves, stretches, and compressions, because you're just going slice by slice. The other problem is that in those 30 seconds, you have to process [up to] 1,000 pictures.
"So what we did is we said, instead of a video we'll use photo input. So we'll take five pictures, not hundreds, and we'll put all of our processing behind them — full HDR Plus, full computational photography, Night Sight — and then we stitch with a little bit of overlap. So instead of getting a little sliver from each picture, it's just a little overlap. That's how [Adobe] Lightroom would do it, for example. We're using the Lightroom method.
"And so we get Night Sight Panorama. We get panoramas now up to 100 megapixels. We get just super, super detailed results, and we can turn on parts of the zoom pipeline that we couldn't necessarily use before. So you can use the 2x zoom, which on a Pixel phone has optical quality. And you can even invoke the 5x telephoto [on the Pixel Pro]. It's a very computational-photography-forward, photo-based panorama."
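As a rough illustration of the photo-based approach — a few fully processed, overlapping stills merged by feature matching rather than hundreds of thin video slices — here is a short sketch using OpenCV's general-purpose stitcher. The file names are placeholders, and this is not the Pixel Camera pipeline.

```python
# Sketch of photo-based panorama stitching: a handful of fully processed,
# overlapping frames merged by feature matching, instead of hundreds of
# thin vertical slices from a video stream. Uses OpenCV's built-in stitcher;
# the file names are placeholders.

import cv2

# Five overlapping still photos (e.g., already denoised / HDR-merged frames).
paths = [f"pano_{i}.jpg" for i in range(5)]
frames = [cv2.imread(p) for p in paths]
if any(f is None for f in frames):
    raise FileNotFoundError("Could not load one or more input frames.")

stitcher = cv2.Stitcher_create(cv2.Stitcher_PANORAMA)
status, panorama = stitcher.stitch(frames)

if status == cv2.Stitcher_OK:
    cv2.imwrite("panorama.jpg", panorama)
else:
    print(f"Stitching failed with status code {status}")
```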
Also: Google Pixel 10 Pro vs. iPhone 16 Pro: I've tried both flagships, and there's an easy winner
Guided Frame (accessibility feature)
Another feature that has flown under the radar, and that Reynolds wanted to point out, was Guided Frame.
"Guided Frame is an accessibility feature. If you are blind or low-vision, we use Gemini to let you frame any photo," said Reynolds. "In that case, you point the camera, you invoke Guided Frame, and it says, 'This is a picture of a scene of the woods with some trees off to the right and a person on the left. Person is in frame, smiling, good for a selfie.' And then it will take the photo. So if you can't really see the screen that well, it helps take selfies and photos, because [selfies] are how people communicate. Whether you're blind or low-vision or not, people communicate using pictures. So it gives them that capability."
Auto Best Take
I also asked Reynolds about the evolution of Best Take into Auto Best Take this year, and was surprised to learn how much this feature relies on more traditional machine learning.
"Auto Best Take is much more traditional processing," Reynolds commented. "You can imagine this as a decision tree, because that's essentially what this feature is. You press the shutter once. If that shutter press was perfect and everybody was smiling, everybody's looking at the camera, then great. Done. One picture.
"Okay, let's say it wasn't perfect. Then we'll open the shutter a little longer and we'll look at every single frame. So that's up to 150 frames in just a few seconds. If we see one that's better, we'll take it, we'll save that one, we'll process it in full HDR Plus quality… So when you go to the gallery, you're going to see the one that we took as the primary — that's called Top Shot. So that's one step down the decision tree.
"Let's say we looked at 150 frames and we couldn't find one that was perfect, but we found one that was almost perfect, and a second one that was almost perfect but in a different way, such as a different face. Then what we'll do is save both of those and pass them to Best Take, and Best Take will combine them into one that is perfect. And Top Shot will intentionally choose a range of photos so that there is at least one photo in which each face is smiling. So if there is a picture of every face smiling at least once somewhere in the set, then it will do a Best Take. When you look at 150 pictures, most of the time you get the shot. So very rarely does it actually go to Best Take. So it's a little odd that we call it Auto Best Take, because in reality, we don't do it very often, since it's at the end of the decision tree.
"The point is that you press the shutter one time and you get one photo, and that photo is perfect. It doesn't matter how we get there. We never want you to have to take three photos [of the same group picture] again. Because why would you take three random photos when [the AI] can look at 150 photos? So we say just press [the shutter button] once. Give it a couple of seconds. You'll see it in the UI. It draws boxes around people's faces. It turns them gold when it thinks it nailed it. So press the shutter, give it a couple seconds, and then watch what you get at the end."
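That decision tree is easy to sketch in code. The outline below mirrors the steps Reynolds walks through, with hypothetical scoring and merging helpers standing in for the on-device face, smile, and gaze models; it is a reading of his description, not the actual feature.

```python
# Sketch of the Auto Best Take decision tree as described in the interview.
# The helpers are hypothetical placeholders for on-device models.

from typing import List


def frame_is_perfect(frame: dict) -> bool:
    """Hypothetical check: every face smiling with eyes open."""
    return all(f["smiling"] and f["eyes_open"] for f in frame["faces"])


def pick_best(frames: List[dict]) -> dict:
    """Hypothetical Top Shot selection: highest overall quality score."""
    return max(frames, key=lambda f: f["score"])


def merge_faces(frames: List[dict]) -> dict:
    """Placeholder for Best Take combining the best face from each frame."""
    return {"merged_from": len(frames), "faces": [], "score": 1.0}


def auto_best_take(shutter_frame: dict, burst: List[dict]) -> dict:
    # Step 1: the original shutter press was already perfect -> done.
    if frame_is_perfect(shutter_frame):
        return shutter_frame

    # Step 2: scan up to ~150 burst frames for a better single shot (Top Shot).
    candidates = [f for f in burst if frame_is_perfect(f)]
    if candidates:
        return pick_best(candidates)

    # Step 3: no single perfect frame -> combine near-perfect ones (Best Take).
    near_perfect = sorted(burst, key=lambda f: f["score"], reverse=True)[:2]
    return merge_faces(near_perfect)
```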
Google Pixel 10 Pro selfie camera.
Sabrina Ortiz/ZDNET
The difference with Tensor G5
Google made a big move in 2025 with the Tensor G5 chip powering the Pixel 10 phones — shifting from having Samsung build its Tensor chips, as in the past, to a TSMC 3nm process that uses TSMC's advanced technology to increase AI performance. I asked Reynolds about the impact.
"[The boost with Tensor G5] is one of the biggest before-and-afters I've ever seen in terms of processing latency," he noted. "The first versions of Pro Res Zoom took like two minutes [to process]. And then by the end, once they got it on Tensor G5 and all the bugs had been worked out, that got down to just a few seconds… So the Tensor G5 TPU is 60% more powerful, and we can definitely see that."
Also: Considering the Pixel 10 Pro? I recommend buying these 5 phones instead – here's why
The AI models powering Pixel photography
Since so many of the Pixel 10's most important new features are powered by AI advances, I wanted to know more about how the Pixel Camera team works with Google's internal AI capabilities.
"It's not like there's this one monolithic Gemini," Reynolds said. "It is very carefully tuned and tested for one particular use case at a time… There are so many more versions of Gemini inside [Google] than you can see outside. And then you have to decide, am I going to prompt this Gemini or am I going to fine-tune this Gemini? It's all super, super customized to a particular implementation." For example, he added, "Magic Eraser is generative, but it's not Gemini."
Final thought
Google is the only one among the dozen or so companies in the world building frontier AI models that also makes its own smartphone. And with the Pixel 10 Pro, the impact is starting to show.





