Progressively loading images – HTTP 203

SURMA: So, Jake, what do you
want to talk about today? JAKE: Well, yeah,
that's a good question. [LAUGHTER] I should know, shouldn't I? Oh, these slides
took me a long time, and I don't think it's worth
the payoff, but let's find out. SURMA: I was going to say,
they don't look that good. JAKE: Shut up, mate. [MUSIC PLAYING] Hello, Surma. SURMA: Hello, Jake. JAKE: So I occasionally
get comments on the video, and people think I'm Surma,
and people think you are Jake. So I think we should
sort that out. And you see what I
did there, right? I called you by your name. I don't usually shout
my own name out. But maybe we could
put a little thing along the bottom, which
makes it super clear who is who, because, you know. SURMA: Oh. JAKE: It's important for our
individual brand identity. Anyway, we've talked a lot
about the size of images and how to make them small. But I want to dive a
bit deeper and look at the different styles
of loading an image. When I see a page
like this, it feels like there's something missing,
but I don't always know what.

SURMA: I mean, it's good
on you that the space– that there is a blank space,
because on many pages, we don't even have
a blank space. And you think, oh, this
site is ready to read, and you start reading. And then things
start moving around, and it's really frustrating. JAKE: Oh, yes, the absolute
worst thing that can happen here is a layout shift.

But this site has at least
left a space for something. But is it a photo? Is it an illustration? Is it a graph? Is it a piece of interactivity? It could be more
than just an image. So a smaller image at this
point will reduce the time that the user is in this
what's-going-on phase. But we can actually do something
while the image is loading as well, and that's
what I wanted to focus on in this episode. You might've seen
some sites do this. SURMA: I was going
to say, are you going to talk about BlurHash? Because that had
attention for a while. JAKE: Yes, so this is a BlurHash
or some other form of making it clear there's an image here.

It could just be a couple
of gradients or something like that. And I would say when I
see this, it's clear to me now that there's supposed
to be an image here. It's probably not
a graph, but it could have core content in it. Like, it could have
text over the top. That wouldn't be
obvious from this view. What we're essentially
saying is sometimes, if there's a slow-loading image
like this, a lot of times, I don't care. I'm just going to
read the content. But you only do that once
you know the image is just illustrative of the content. Like, it's not a core piece
of content in and of itself.

You can solve this problem
by throwing more bytes at a preview like this. And once I see this,
I'm like, right, OK. It's a kid with an ice cream. Fair enough. It's supplementary
content, so I could just go ahead and read the article. And in the meantime, the
image can continue to load. And that's a much
faster user experience compared to having
nothing and then the image appearing fully loaded. Meanwhile, the user's just
staring at a blank area wondering what's going on. But there are many ways
to tackle this problem. There are some solutions
which are pretty new, and there are some which
are almost 30 years old. And I'm going to try and cover
all of them in a small video.

I'm going to start
with the old one. I'm going to start with JPEG. SURMA: That's the
30-year-old one, right? JAKE: 29, I think. But there have actually been
innovations in this area, even in the last few months. I want to talk
about those as well. So here's an image that's
pretty hard to compress for whatever reason. I don't know why this is
a really difficult image to compress. I guess it's because there's
lots of sharp lines, lots of different areas of the image,
different things going on. SURMA: I mean, so
technically, it's probably quite easy to compress. It will just [BLEEP]. JAKE: Yes. So I was trying to avoid
that with this image. And it turns out you have
to throw a lot of bytes at it in order for it
to not look terrible. Especially in some
newer image formats, it's harder to
compress this image in newer image formats, which
is unusual, but does happen.

But with JPEG,
they're both 300k. They look pretty good. They're about the same quality. But they're quite different
in how they're encoded. Here's how they
look when they're loaded over a 2G connection. I made it super slow, so
you can see what's going on. Now, I'd say that
the one on the right feels like it loaded faster. Like, you could tell what
the image was a lot sooner.

In fact, it kind of feels
like the one on the right has finished loading, but it hasn't. It's still filling
in tiny details. They take the same
amount of time to load because
they're the same size. SURMA: Right. JAKE: There are
people who don't like the effect on the
right of it loading in multiple scans like that. But I think they're wrong, but– SURMA: I mean, don't
you remember from– I mean, I certainly do
from the dial-up days, where most images– not most, but many images
behave like the one on the left, the progressive– not progressive,
but they're just like with the line-by-line
appearance of the final image.

And then your internet would cut
out, or buffering would happen, or something happened just
when the core part of it was about to happen. Like, there's the character I
was waiting for or something, and then the image
just stops loading. Nothing is more frustrating. JAKE: Exactly. Whereas the one on the
right, which, yeah, is a progressive JPEG, you get
this low-quality version first, and then it comes in and
fills in the tiny details. The one on the left
is not progressive. Some software calls it
baseline, although that's a slightly different thing. But they sometimes use that to
mean not progressive as well. A baseline JPEG
can't be progressive, but it also means it can't
do some other things.
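The baseline versus progressive distinction shows up in the file itself: a JPEG announces its frame type with a start-of-frame (SOF) marker. The sketch below is a minimal illustration of that idea, not a full parser. It walks the marker segments at the start of a JPEG and reports the first SOF type it finds. The function name `jpeg_frame_kind` is made up for this example, and it assumes every segment before the frame header carries a length field, which holds for typical files.

```python
import struct

# Second byte of the start-of-frame markers defined by the JPEG standard.
SOF_KINDS = {
    0xC0: "baseline sequential",
    0xC1: "extended sequential",
    0xC2: "progressive",
}


def jpeg_frame_kind(data: bytes) -> str:
    """Return the frame type named by the first SOF marker in a JPEG byte string."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:
            # Fill byte before a marker: step over it.
            pos += 1
            continue
        if marker in SOF_KINDS:
            return SOF_KINDS[marker]
        if marker == 0xDA:
            # Start of scan: image data begins, so stop looking.
            break
        # Simplification: assume the segment has a 2-byte big-endian length.
        (length,) = struct.unpack(">H", data[pos + 2 : pos + 4])
        pos += 2 + length
    return "unknown"
```

Calling it on the bytes of a file, for example `jpeg_frame_kind(open("photo.jpg", "rb").read())` with a hypothetical file name, would tell you which kind an encoder produced. Note that "extended sequential" is also not progressive, which matches the point that baseline and nonprogressive are not quite the same thing.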

Anyway, Squoosh, the image
compressor that we work on, that creates progressive JPEGs
by default because we like it. We use MozJPEG,
Mozilla's JPEG encoder, although there is
an option in there to create the nonprogressive
kind if you really want for whatever reason. Now, I said there'd
been an innovation here, so I want to take
a look at that. Chrome and Firefox have received
an update to their JPEG decoder recently. So here's what it looks
like compared to Safari, which hasn't.

So 3, 2, 1, go. And it might be difficult to
notice the difference if you're watching this on a
small YouTube window, or for you, Surma,
who's watching it through a teleprompter. But it's really noticeable
at full size for real users. Here it is up close. This is the first
pass in Safari. You get this blocky
nearest-neighbor scaling. Whereas in Chrome
and Firefox, you get this much nicer
smooth appearance. And that's pretty new. Chrome and Firefox were
blocky until recently. SURMA: So it's the blocky part– I think they call that
the DC layer, right? It's literally just the–
because JPEG divides them into 8-by-8 blocks.

Are those just the 8-by-8
blocks with one solid color, basically? JAKE: Yes, that's right. It's the DC part
of the encoding. So it's solid colors, and, yes,
what you're getting in Chrome or Firefox is just an
interpolated version of that. SURMA: Ah. That's clever. JAKE: So yeah, most
progressive JPEGs will have the first phase just
being the DC information for– SURMA: So they're both
basically the same data, just that Chrome and
Firefox are applying a post-processing effect
on the decoder side to make it look more
pleasant while more data is being loaded. JAKE: Yes, this is literally
the same image loading in each browser
at the same speed. That's what you saw there. I just– SURMA: Interesting.

JAKE: –did a screen
capture of it. Yeah, so Chrome and Firefox use
libjpeg-turbo as their decoder, whereas Safari
uses a custom one, I think, one that Apple made. So it doesn't have this benefit
yet, but they might add it. Also, Safari's
decoder sometimes just doesn't decode progressive
images in a progressive way. It just waits until the
whole image is there. And I don't know why it happens
with some images and not others. I don't know. I guess it's some bugs. But we also have
bugs in our decoder, and I found some bugs
just making the slides. One of the reasons it took me
so long to get these slides together is I started
finding bugs in our decoder while making it, so then had
to go and file those bugs.
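To make the DC-only first pass a little more concrete: the DC value of each 8-by-8 block is proportional to that block's average, so a first pass can be imitated by averaging tiles. The sketch below does that on a grayscale image stored as a list of rows, then upscales the averages two ways: blocky nearest-neighbor, and a smoother bilinear blend. The bilinear step is only a stand-in chosen for simplicity. It is an assumption of this example and not the smoothing algorithm the browsers' decoder actually uses. All function names are made up here.

```python
def block_means(pixels, block=8):
    """Average each block-by-block tile of a 2-D grayscale image (list of rows)."""
    height, width = len(pixels), len(pixels[0])
    means = []
    for top in range(0, height, block):
        row = []
        for left in range(0, width, block):
            tile = [pixels[y][x]
                    for y in range(top, min(top + block, height))
                    for x in range(left, min(left + block, width))]
            row.append(sum(tile) / len(tile))
        means.append(row)
    return means


def upscale_nearest(means, block=8):
    """Blocky preview: every pixel of a tile gets that tile's mean."""
    return [[means[y // block][x // block]
             for x in range(len(means[0]) * block)]
            for y in range(len(means) * block)]


def _axis_weights(index, block, count):
    """Map a pixel index to (lower tile, upper tile, blend factor) on one axis."""
    pos = (index + 0.5) / block - 0.5        # position measured between tile centres
    pos = min(max(pos, 0.0), count - 1.0)    # clamp so edges repeat the outer tile
    lower = int(pos)
    upper = min(lower + 1, count - 1)
    return lower, upper, pos - lower


def upscale_bilinear(means, block=8):
    """Smoother preview: blend the four nearest tile means."""
    rows, cols = len(means), len(means[0])
    out = []
    for y in range(rows * block):
        y0, y1, ty = _axis_weights(y, block, rows)
        line = []
        for x in range(cols * block):
            x0, x1, tx = _axis_weights(x, block, cols)
            top = means[y0][x0] * (1 - tx) + means[y0][x1] * tx
            bottom = means[y1][x0] * (1 - tx) + means[y1][x1] * tx
            line.append(top * (1 - ty) + bottom * ty)
        out.append(line)
    return out
```

Both previews are built from exactly the same block averages, which mirrors the point made in the episode: the data is identical and only the decoder-side presentation differs.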

We sometimes fall back
to this blocky effect too early where we
shouldn't, and that's something we need to fix. But JPEG is actually
really amazing as a format because you can script– when you're encoding
an image, you can script what the
progressive passes should be, how much detail
each pass should have, what detail it
should be missing, and how many passes it
should have in total.

It's really flexible. We might actually ship
our own in Squoosh. That makes– SURMA: Yeah, I saw you
had a draft PR open, and I saw you were
moving around some C code from the MozJPEG encoder. And I never realized– and
I ported the MozJPEG encoder originally, and I
never realized that we had detailed control over
the individual passes, as they are called,
of the JPEG decoding. That's really, really cool. JAKE: Yeah, so what
most JPEG encoders will do is the DC pass first,
like those 8-by-8 blocks. And then it, as
soon as possible, gives you a low-quality version
but with a bit more data.

And that's actually not
great with the new decoders that Firefox and
Chrome are using, because it throws them into a
more blocky output much earlier than we otherwise would. So what I'm trying to do
with the one in Squoosh is to avoid that midstep but
still deliver a sharp image as soon as possible. So I'm not going to
ship that yet because I want to wait until we
fix our decoder bugs, because then I can get to see
exactly what the difference is. But if it's still better,
then we'll ship that. But yeah, I'll put links
to all of this stuff in the description. All right, that's
enough about JPEG. I want to play a little
game, and it's called– SURMA: Oh boy. JAKE: –Guess the Format. And I'm going to show
you an image loading, and it's your job to guess
what image format it is.

SURMA: All right. JAKE: It might be JPEG. It might be something
else, because it could be a different form of JPEG. Anyway, here we go. 3, 2, 1, go. [CLOCK TICKING] What're you thinking? You still thinking? Are you still there? SURMA: I'm trying to think. I think that might be PNG. JAKE: Oh. Well– SURMA: Uh, but, um– JAKE: Well done. No, you're correct. SURMA: Ayy. JAKE: I'll give you that in one. This is the lesser-spotted
interlaced PNG. So this is a feature of PNG
that's not used so much. It uses a form of interlacing
called Adam7, which means it's seven scans. It starts off with a
1/8-resolution image, a bit like the JPEG
that we saw before. And then it doubles the
horizontal resolution, then doubles the vertical, then
horizontal, then the vertical. And you end up with
seven passes in total. GIF has a similar thing,
but it's only four passes, and it only improves
the vertical resolution.
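For reference, the seven Adam7 passes can be written down as a small table of start offsets and steps, taken from the PNG specification. The sketch below uses that table to work out which pass delivers each pixel. The function names are made up for this illustration.

```python
# (x_start, y_start, x_step, y_step) for each of the seven Adam7 passes,
# as laid out in the PNG specification.
ADAM7 = [
    (0, 0, 8, 8),
    (4, 0, 8, 8),
    (0, 4, 4, 8),
    (2, 0, 4, 4),
    (0, 2, 2, 4),
    (1, 0, 2, 2),
    (0, 1, 1, 2),
]


def adam7_pass_pixels(width, height):
    """Return, for each pass, the list of (x, y) pixels that pass transmits."""
    passes = []
    for x_start, y_start, x_step, y_step in ADAM7:
        passes.append([(x, y)
                       for y in range(y_start, height, y_step)
                       for x in range(x_start, width, x_step)])
    return passes


def pass_grid(size=8):
    """Label every pixel of a size-by-size tile with its pass number (1 to 7)."""
    grid = [[0] * size for _ in range(size)]
    for number, coords in enumerate(adam7_pass_pixels(size, size), start=1):
        for x, y in coords:
            grid[y][x] = number
    return grid


if __name__ == "__main__":
    for row in pass_grid():
        print(" ".join(str(n) for n in row))
```

In an 8-by-8 tile the first pass carries a single pixel, which is the 1/8-resolution image, and each later pass carries as many pixels as all the earlier ones combined, with the first two passes carrying one pixel each.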

It starts off with the
full horizontal resolution. SURMA: Ah. JAKE: But what I will
say is, don't do this. Don't use interlaced PNGs. I mean, you've been
looking at PNG compression and JPEG XL
compression recently, how it uses the pixel
before to do prediction about the next pixel. This form of interlacing
makes that very hard, so it ends up being
about 20% bigger. It's not great for
the compression. Generally, PNG should
be quite small, so they don't really benefit
from this interlacing thing. I would say, if you're
targeting an old browser where you need the
alpha transparency, but otherwise it's full of
data, so it's a huge PNG, maybe do the interlacing thing. But otherwise, no. Squoosh doesn't
do this right now, but I have a draft
PR ready to go. Do you know the
only reason I didn't push this PR your way is I
thought it would give you a clue to this question. I was hoping you
would get it wrong, but you got it right anyway.

But we'll ship interlaced PNGs
in Squoosh– off by default, of course. SURMA: Nice. I like it. JAKE: OK, another question. SURMA: OK. JAKE: Here we go. SURMA: I'm ready. JAKE: 3, 2, 1, go. [CLOCK TICKING] What could it be? Ah. There you go. SURMA: OK, so there was no
progressive step in that sense. It was full resolution
from the very beginning, just top to bottom.

It has alpha transparency. JAKE: So there was some
form of progression. Like, you saw some
intermediate steps. But as you say, it was
just a top-to-bottom scan. SURMA: That was just a
scan line, basically. JAKE: Yes. SURMA: So this has proper
alpha transparency, not just a mask, so– and the amount of colors– I'd say GIF is out. JPEG is out. It could be PNG. It could be AVIF. I guess it could also be JPEG
XL, but I actually don't know.

I think JPEG XL's
progressive is– the whole point
of their marketing is that it's the same
progressive rendering as JPEG, so it would probably actually
start with a lower resolution. I am going to say it's
got to be JPEG or AVIF. Not J– PNG or AVIF. JAKE: PNG or AVIF? SURMA: I'm not sure how I
would distinguish those two just by the progressive render. I'm just going to put my
money on AVIF because you haven't had that yet. JAKE: That's a reasonable
bet, but in this case, this is a lossy WebP. SURMA: I forgot about WebP. I'm so bad. JAKE: I was going to say,
the one you didn't mention. I think if you thought of WebP,
you would have guessed it was. But it is specifically
lossy WebP. So yeah, top-to-bottom scan,
no fancy progressive rendering like JPEG has. Part of this is because
JPEG is an image format, whereas WebP is an
image format created from the keyframes of
a video format, VP8. And video formats do not
need multiscan progressive rendering.

They just need to display
one frame at a time. So it's not something
WebP can do. In fact, from what I'm told,
it was actually a lot of effort for them to even make the
top-to-bottom rendering work. They had to do a lot of shifting
around of data to make it work. But there was another
element of the load there which is interesting. What I'm going to do here
is I'm going to compare lossy WebP with lossless WebP. Lossless WebP is a
whole other format. It's not related to
the video codec at all. I'm going to start them now. [CLOCK TICKING] Now, the lossy
version finishes first because it's a smaller
file, but notice how it took a lot longer
to get started compared to the lossless version. SURMA: Yeah. And I have a hunch. JAKE: Go on, then. What's your hunch? SURMA: I think
lossy WebP is lossy.

But one thing that
I have learned throughout our
work on Squoosh is that it's very hard to
lossy-compress the alpha channel. And so if you have
lossy data, you probably have to encode the alpha
channel separately lossless, because otherwise
progressive render is going to look really, really weird. JAKE: That's correct. I would say correct
answer, maybe not entirely the right reasons. SURMA: Ah, OK. JAKE: It's more
of a product of it being derived from a
video format, again. What these formats tend to do
is they encode channels one by one.

And this was the thing that was
difficult for WebP to untangle. So they were delivering
things together. But they didn't do that
with the alpha data, so the alpha data sits
at the front of the file. Yeah, and you're correct. It is lossless. It uses lossless WebP
for the alpha data. It sits right at the
front, so that wait that we had at the start was it
loading all of the alpha data before it could start
adding the color data in. SURMA: Which is,
you know, invisible. JAKE: Exactly. [LAUGHTER] And yes. I mean, they could have done
it the other way around, but alpha data
tends to be smaller than the rest of the data.
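The "alpha sits at the front" layout can be seen in the container itself. A WebP file is a RIFF container, and in the extended layout described by the WebP container specification, the alpha data lives in an `ALPH` chunk that comes before the `VP8 ` chunk holding the lossy color data. The sketch below lists the top-level chunks in order. The function name is made up for this example, and it only reads chunk headers. It does not decode anything.

```python
import struct


def list_webp_chunks(data: bytes):
    """Return (fourcc, payload_offset, payload_size) for each top-level chunk."""
    if data[:4] != b"RIFF" or data[8:12] != b"WEBP":
        raise ValueError("not a RIFF/WEBP container")
    chunks = []
    pos = 12  # skip "RIFF", the 4-byte file size, and "WEBP"
    while pos + 8 <= len(data):
        fourcc = data[pos:pos + 4].decode("ascii")
        (size,) = struct.unpack("<I", data[pos + 4:pos + 8])
        chunks.append((fourcc, pos + 8, size))
        # Chunk payloads are padded to an even number of bytes.
        pos += 8 + size + (size & 1)
    return chunks
```

For a lossy WebP with transparency, the list would be expected to show `ALPH` ahead of `VP8 `, so a decoder reading the stream top to bottom receives all of the alpha bytes before any color bytes arrive.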

Whereas lossless
WebP, because it's a pixel-by-pixel compression,
it does the alpha data along with each pixel. They could have found a way to
interleave the alpha data, but, especially with it being a
completely different format, not easy. And lossy alpha does work. It's something AVIF
does quite well. But yes, that's not
how WebP does it. SURMA: Oh, I didn't know it. I didn't know that AVIF
can compress lossfully– lossfully compress
alpha and deal with the artifacts
that might happen.

JAKE: It works really well. So WebP will struggle if
you've got a lot of gradients and stuff in your alpha channel,
whereas that's where AVIF will do really, really well. All right, next format. Here we go. 3, 2, 1, go. [CLOCK TICKING] Any early guesses? SURMA: Um. JAKE: Tada! There we go. SURMA: [LAUGHS] So that is an image
format that has no progressive capabilities. JAKE: That's correct. SURMA: Actually, now that I
think about it, I don't even know why I was considering
AVIF, because I think the whole point
is that AVIF currently doesn't do progressive because,
again, it's a video format thing.

It's probably AVIF. JAKE: You are correct. Yeah. This is it. The AVIF decoders in
Chrome and Firefox– they don't have any
intermediate rendering at all. It's not even clear
whether they can. I think maybe the
same problem again, like the two color channels
are brought in first, maybe. It might not even be
possible for them to ever do any kind of even scanline
or progressive rendering here. And I sped that up compared
to the other demos we've seen. I think AVIF is magic. It's my favorite image
format right now, especially because
it's in Chrome and will soon be in Firefox. I've written a big blog
post about how great it is. But there are some kinds of
images that it struggles with, and this is one of them. I had to throw so many
bytes at this image to prevent ugly,
flat areas appearing, and it ended up bigger
than the JPEG in this case. It's one of those outliers. Not usually how it goes. It's usually a fraction
of the size of a JPEG, sometimes like a tenth of the
size, and it still looks OK.

Not the case here. SURMA: I mean, that's the whole
point of Squoosh, where you can try these things out, because
I think one of the things that we really want to
get people to understand, there is not a single image
format to rule them all. There's different types of
images where different image formats excel at. And so it's often the
case of actually taking the time at the important images
on your website, the big ones, actually take the time
and look at what Squoosh has to offer in the
different formats and use the ones that work
best for that specific image. JAKE: Absolutely, absolutely. So this way that you get
nothing until you get everything with an AVIF– I've been proposing a
way to mitigate this.

And I know you've
seen this already. It looks like this. So this is a 4.3k AVIF but with
a blur filter over the top. So it's a tiny, little image
that can sit at the front, especially compared to
300k for the full image. Now, without the blur, it
looks awful, kind of very alien, almost art in some way. But yeah, once you
apply the blur, I think it looks like
a really good preview. And the AVIF standard already
allows multiple images to be in one
container, and those can be tagged as thumbnails. So what we're going to look
at is if there's a thumbnail to start the image, we could
show that early in the browser and then maybe apply
this blur filter to it. All just ideas right
now, but I'm hoping we get something like that. So you'll get a preview
like this straightaway, and then you'll get the final
image once it downloads. SURMA: And, I mean, yeah,
if the final image is 300k, adding another 4k isn't
really that big of a deal.
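A quick back-of-the-envelope check of those numbers, as a sketch. The sizes are the 4k preview and 300k full image from the episode. The link speed of 400 kilobits per second is an assumed figure picked purely for illustration, 1k is treated as 1,000 bytes, and latency is ignored.

```python
ASSUMED_LINK_KBPS = 400  # hypothetical slow mobile link, kilobits per second


def overhead_percent(preview_kb, full_kb):
    """Extra bytes added by the preview, as a percentage of the full image."""
    return 100 * preview_kb / full_kb


def seconds_to_download(size_kb, kilobits_per_second=ASSUMED_LINK_KBPS):
    """Transfer time for size_kb kilobytes, ignoring latency and protocol overhead."""
    return size_kb * 8 / kilobits_per_second


if __name__ == "__main__":
    print(f"overhead: {overhead_percent(4, 300):.1f}%")
    print(f"preview:  {seconds_to_download(4):.2f} s")
    print(f"full:     {seconds_to_download(300):.2f} s")
```

Under those assumptions the preview costs a little over 1% extra and arrives in a fraction of a second, while the full image takes several seconds.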

It's an increase of about 1%. So your users can get a
preview of the image much, much earlier,
especially for AVIF. Like, if you're on
3G, downloading 4k will be significantly faster
than downloading 300K. So the time difference where
the user sees something will be massive. JAKE: Absolutely. And I'll put links to the
spec discussion and some more demos of this effect
in the description. All right, one more example. Here we go. 3, 2, 1, go. [CLOCK TICKING] A little gap at the start. Oh, and there we go. And I don't know if
you can see this. You might not see it
on the smaller screen, but we've got kind of these– SURMA: I can see
that there's blocks appearing of higher resolution. JAKE: Exactly. So this is like a two-phase
progression, sort of. You get the low-detail
DCT, similar to JPEG, the 8-by-8 squares. And then it's going and filling
in the full resolution block by block.

This was a JPEG XL. SURMA: Ah, that would
have been my guess. JAKE: Well, it's the one left. SURMA: Because that's
the one that was– yeah. [LAUGHTER] JAKE: This doesn't work
in any browser yet. JPEG XL is behind a flag
in Chrome and in Firefox, but there's no
progressive rendering yet. A little bit like JPEG,
there's multiple ways that progressive
rendering can be done. And that can be done
at the encode time, but also the decoder has some
say in the matter as well. This was the settings
I was recommended. It can do more passes and
different kinds of passes. Yeah, currently in
Chrome and Firefox, it loads more like an AVIF. You get nothing, and then
you get the whole thing. But JPEG XL has been designed
with this progressive rendering in mind.

So hopefully, that's
what we'll get when it lands in browsers proper. So where does this put us? We've got these
three formats that have some kind of
progressive nature. We've got JPEG,
we've got JPEG XL, and then we've got the
hacky idea of the AVIF with an extra
image at the start. SURMA: And no PNG? JAKE: Oh, I'm not
including that. It's rubbish. [LAUGHTER] I mean, it's so big. It's not worth it. I'm going to load them
all at the same time. Here we go. 3, 2, 1, go. [CLOCK TICKING] JPEG still does really,
really well here. JPEG XL gets to full
detail a lot faster because JPEG XL was the– I could get to a decent
quality with a much lower file size with JPEG XL in this case
in the way that I couldn't with AVIF.

It did a much better job of it. But then, yeah, so JPEG XL
gets to full quality quicker than JPEG. But JPEG got to the lower
quality quicker than JPEG XL. And that really tiny AVIF
with the blur filter– that was there first,
but it takes forever for it to get to
the full version, just because it's such a
huge image in this case. I'm really excited about
the different innovation in this area and different
directions that this is taking. My gut feeling is still
that I will be probably using AVIF for most
images on the web because I'm quite happy
with the kind of loss that I get with AVIF.

It smooths images a
lot, but in most cases– a bit like the image
I started with, like the kid eating ice cream– you don't need full detail. You need it to not
look ugly, but you just need to give the impression of
what the image is as soon as possible, and AVIF tends to do
really, really well with that. But the philosophy of JPEG
XL tends to be more for big, high-quality images, so
things like Unsplash, Flickr– if you've got a news
website, but you've got a dedicated
images page, then maybe JPEG XL is going to be
the better thing to put there.

And you get this lovely
progressive rendering to go along with it, which is– I'm really excited about that. SURMA: I mean, I'm actually also
quite excited to see if and how WebP 2 will fare in the battle
of progressive rendering, because one thing,
for me at least, that sets WebP apart is JPEG XL
was designed with the web in mind. AVIF wasn't designed
for the web at all. It was designed for video. WebP is not just designed
with the web in mind, but specifically for the web. They made a whole
bunch of shortcuts. It was like, we are trying
to find a format for the web. JPEG XL wants to
work on the web, but also work in
other use cases. And I think that
different scoping will allow them to have
a whole bunch– take different trade-offs than
what JPEG XL is doing. As you were saying,
JPEG XL does really well at high-quality images,
in my experience, not necessarily
at the "let's make it look good with less bytes
at screen size," necessarily.

They do well there sometimes
as well, but not all the time. AVIF, in my experience, does
better in that scenario. But I do wonder if WebP 2 will
make interesting trade-offs and become a contestant here. JAKE: Yeah, and it's really
too early to tell with WebP v2. It's all sort of hype right
now, although we do have it in Squoosh, a preview. But I do like the
way they're heading. They're heading
like, what can we do that's similar to AVIF but
with all of these web benefits on top? And if they deliver on
that, then my favorite will switch from
AVIF to WebP v2, if they deliver
on that, because I know they're looking at all of
the progressive stuff as well.

So yeah, we'll do
another episode on it when there's something
to show there. But it's in Squoosh if
people want to have a play, but it's super
experimental right now. In the meantime, there
are other techniques. I showed you this page earlier,
but with a blurry preview. You mentioned BlurHash. I'll put a link to that. That's a way that you
can create these previews in just a few bytes. In this case, it's 50 bytes. But you also need
to ship a decoder for that, which
is fine if you're a native app, because you
can just put that along with your APK or whatever.

But on the web, if
it's your main image, then shipping a decoder
along with it becomes– well, you have to
factor it in, certainly. It's only 1k. It's super small. 1k with brotli compression. But if you're
talking 1k and then a few bytes for the
images themselves– you know, here's what a 1k WebP
looks like with a blur filter over the top. You get much more detail
for around the same size if you factor in the
size of the decoder. SURMA: And also, it's
JavaScript reliant. It can potentially break or
not even run for some people, and it will occupy
the main thread, while this kind
of image decoding can happen off the main thread.

There's lots of
trade-offs involved in shipping your own image
format versus using something that the browser
already supports. JAKE: Yeah, and in
terms of trade-off, it's worth mentioning that
these blur effects are not free either. They can actually
be pretty expensive. So that's always something
to measure as well, versus what kind of trade-off
are you making there with that blur filter? So it's not something I
would use a lot on a page. But yes, really,
one of the best ways you can deal with
this thing right now is just use format with
a nice progressive render. If it's a big image, JPEG
is right there right now. In Squoosh, you can use this
progressive rendering stuff. And, yeah, it's an
old image format, but sometimes it will
give you that best result, especially for images that
are, like, 100k, 200k.

If you can get down
to 20k, 30k with AVIF, then that's a better option. But yeah. SURMA: I find it really
interesting, as you say, JPEG being close to 30 years
old and actually still– it is literally competitive
to the next-gen formats in some scenarios. JAKE: Because it's
designed for the web. I'll also put a
link to a discussion where the Largest
Contentful Paint metric is looking at
how they should consider progressive image rendering,
because they don't right now. So I'll put a link to
that, because they're going to consider some
kind of cutoff point when the amount of
detail is good enough. So that will be
very important when we start getting more
progressive formats, especially things like JPEG XL. But that's all I
wanted to cover. SURMA: Yeah.

I think that was a
good episode, Jake. JAKE: Well thanks, mate. SURMA: I enjoyed it. I learned something. JAKE: Cheers. Hey, do you know what? As long as you press the
little thumbs up icon, that's all I care about. [LAUGHTER] It's going to be one
thumbs up, 800 down. But, oh, now that
I've said that, it's going to encourage
people to click it. You know what? Never mind. Edit this bit out. Just don't show this bit. Lucas, cut this bit out. Cut this bit out. Definitely cut this bit out.

Bye. SURMA: [LAUGHS] Bye. JAKE: We've talked a lot
about the size of images and how to make them small. SURMA: Ah. OK, I was going to
say, I was wondering where you were going
with ice cream, but yes. JAKE: Well, probably you
don't have that slide on the screen yet, so– SURMA: I mean,
that's up to Lucas whether this reference
works or not. JAKE: It's going well. But I wanted to
dive a bit deeper and look at the different
styles of loading an image.

