Issue Details

Key: CAA-17
Type: Improvement
Status: Closed
Resolution: Fixed
Priority: Normal
Assignee: Robert Kaye
Reporter: Ian McEwen
Votes: 3
Watchers: 1

Cover Art Archive

Thumbnailer-bot should have higher quality settings to minimize JPEG artifacts

Created: 04/May/12 05:54 PM   Updated: 15/May/12 12:39 AM   Resolved: 15/May/12 12:39 AM
Component/s: None
Affects Version/s: None
Fix Version/s: None


Description

The archive.org thumbnailer bot seems to be running with a low JPEG quality setting, and possibly also some blurring. As a result, thumbnails end up with substantial JPEG artifacts, even when the originals are high quality. This is especially bad for text and for vector-graphics-style blocks (generally: anything with sharp edges), as you can see in the examples below. A lot of text ends up unreadable or close to it, especially in 250px thumbnails.

I'm not sure what the archive is using, but they should up the quality setting, whatever it is – less blurring, better JPEG quality settings, or perhaps they just have a size limit that should be increased so their process can use higher-quality compression.

My own experimentation on the topic follows, with some examples at the bottom:

If they're using ImageMagick's mogrify, the option they need is likely '-quality' (http://www.imagemagick.org/script/command-line-options.php?ImageMagick=t820u50n41e6sb4blf1phj31j1#quality). Their thumbnails don't look the same as the ones from my experiments with mogrify, though, so I'd guess they're using something else. Their artifact level seems to fall between that of my images with quality set to 25 and 50 (out of 100). mogrify's default is "whatever it can detect, or 92".

After some experimentation, the sizes of the resulting files for quality 25, 50, 75, and default are 8K, 8K, 12K, and 60K for the first example below (9.7M to start), and 32K, 36K, 44K, and 60K for the second example below (396K to start). The archive's files are around 10K, so I'm impressed with the small sizes they're getting, but the artifacts are quite bad, especially since we use thumbnails in most places where we display anything. They may also be doing some sort of blur first; the black text appears much more gray in their images.
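
For anyone who wants to repeat this kind of comparison, here is a minimal sketch of how it could be scripted. It is not the exact procedure used above: it assumes ImageMagick's `convert` is on the PATH (rather than `mogrify`, so the original file is never overwritten), and the function names, output file names, and the 250px bound are illustrative choices.

```python
import os
import subprocess

# Quality levels to compare; None means "pass no -quality flag",
# so ImageMagick falls back to its own default.
QUALITIES = (25, 50, 75, None)


def thumbnail_command(src, dst, size=250, quality=None):
    """Build the argv for an ImageMagick `convert` call that fits `src`
    inside a size x size box and writes the result to `dst`."""
    cmd = ["convert", src, "-resize", f"{size}x{size}"]
    if quality is not None:
        cmd += ["-quality", str(quality)]
    return cmd + [dst]


def compare_qualities(src, out_dir):
    """Write one thumbnail per quality level and return a dict that maps
    the quality label to the resulting file size in bytes.
    Requires ImageMagick to be installed."""
    os.makedirs(out_dir, exist_ok=True)
    sizes = {}
    for quality in QUALITIES:
        label = "default" if quality is None else str(quality)
        dst = os.path.join(out_dir, f"thumb250_q{label}.jpg")
        subprocess.run(thumbnail_command(src, dst, quality=quality), check=True)
        sizes[label] = os.path.getsize(dst)
    return sizes
```

Calling `compare_qualities("original.jpg", "out")` on a local copy of one of the originals would write four thumbnails into `out/` and return their sizes, which makes it easy to eyeball the artifacts against the file size at each setting.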

Examples:
600dpi scan, with text: http://ia601204.s3dns.us.archive.org/mbid-18d9bb0c-2cba-47b3-b9ce-da770c6f0cc9/mbid-18d9bb0c-2cba-47b3-b9ce-da770c6f0cc9-834309549_thumb250.jpg
Original: http://ia601204.s3dns.us.archive.org/mbid-18d9bb0c-2cba-47b3-b9ce-da770c6f0cc9/mbid-18d9bb0c-2cba-47b3-b9ce-da770c6f0cc9-834309549.jpg

Vector graphics + rendered CG: http://ia601207.s3dns.us.archive.org/mbid-081ff32e-7f3a-4b0f-85a0-cf247dd54f5d/mbid-081ff32e-7f3a-4b0f-85a0-cf247dd54f5d-838290502_thumb250.jpg
Original: http://ia601207.s3dns.us.archive.org/mbid-081ff32e-7f3a-4b0f-85a0-cf247dd54f5d/mbid-081ff32e-7f3a-4b0f-85a0-cf247dd54f5d-838290502.jpg



Robert Kaye added a comment - 04/May/12 06:46 PM

This issue has been raised with IA. I also suggested that if they couldn't find time to work on this, one of us could work on it. We'll have to see what they say.


Václav Brožík added a comment - 10/May/12 11:00 AM

Yes, it is a pity that small text becomes unreadable in thumbnails. Below is an example with a test thumbnail of higher quality.

the release page:
http://musicbrainz.org/release/1b6d2cc4-8e5e-4a4a-9fea-68296972d9e7

thumbnail (250x250 pixels, quality 60, ~8.4 kiB):
http://ia701202.s3dns.us.archive.org/mbid-1b6d2cc4-8e5e-4a4a-9fea-68296972d9e7/mbid-1b6d2cc4-8e5e-4a4a-9fea-68296972d9e7-910858148_thumb250.jpg

original quality (300x300 pixels, quality 97, ~53 kiB):
http://ia701202.s3dns.us.archive.org/mbid-1b6d2cc4-8e5e-4a4a-9fea-68296972d9e7/mbid-1b6d2cc4-8e5e-4a4a-9fea-68296972d9e7-910858148.jpg

I tried saving the 250x250 thumbnail with quality 75 after slight sharpening, and the text is clearly readable at a file size of ~12.4 kiB:
https://docs.google.com/open?id=0B-WJ-EDQFnk4Wm40OFZXUC1KcXM
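
A rough sketch of one way to reproduce a "slight sharpening, then quality 75" thumbnail with ImageMagick follows. The tool and the sharpening amount behind the test image above are not stated, so `convert`, the `-unsharp 0x0.5` value, and the function names here are all assumptions chosen for illustration.

```python
import subprocess


def sharpened_thumbnail_command(src, dst, size=250, quality=75, unsharp="0x0.5"):
    """Build the argv for an ImageMagick `convert` call that resizes `src`
    to fit a size x size box, applies a mild unsharp mask, and saves `dst`
    as a JPEG at the given quality.

    The default unsharp geometry (radius 0 = auto, sigma 0.5) is an
    assumed value for "slight sharpening", not one taken from this issue.
    Pass unsharp=None to skip the sharpening step."""
    cmd = ["convert", src, "-resize", f"{size}x{size}"]
    if unsharp:
        cmd += ["-unsharp", unsharp]
    return cmd + ["-quality", str(quality), dst]


def make_sharpened_thumbnail(src, dst, **options):
    """Run the command built above; requires ImageMagick to be installed."""
    subprocess.run(sharpened_thumbnail_command(src, dst, **options), check=True)
```

Sharpening is applied after the resize so that it acts on the pixels that actually end up in the thumbnail; the sigma would need tuning by eye for a given set of covers.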

Unfortunately, nikki said that raising the quality for 250x250 thumbnails is unlikely because it is a system-wide setting of archive.org.
See http://musicbrainz.org/edit/17561822


Robert Kaye added a comment - 15/May/12 12:39 AM

Thanks to Alexis for making this happen!