Preserving Image Metadata in Go
I’ve been digging into the world of image parsing for the last couple of days, after fixing image orientation issues but introducing a new one in the process: image metadata (particularly for color correction) getting lost during processing.
In short, Go’s standard image/jpeg library tosses out metadata when you decode an image to transform it, as we do for images uploaded to Snap.as. So preserving that metadata means first parsing it out, running the transformations, then writing it back to the image when you encode it with the new transformations. Should be straightforward, right? Oh, let’s see…
Usually I’d lean heavily on existing libraries to do this for me, but surprisingly, there aren’t a ton of options out there for this specific use case: preserving metadata. There are a ton of libraries for parsing EXIF data, which is useful, but way too heavy for my straightforward situation. I tried out some libraries hoping they would quickly solve my problem before realizing I’d just need to figure out how JPEGs actually work.
In short, every JPEG file includes a series of data segments that include everything from the image itself, to color information, to GPS coordinates of where you took the photo. There are a number of APP segments (APP1, …) that all hold different information. For the color profile information, I was interested in APP2.
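To make the segment layout concrete, here’s a minimal sketch of walking a JPEG’s header segments. Each one starts with 0xFF and a marker byte (0xE1 for APP1, 0xE2 for APP2), followed by a two-byte big-endian length that counts itself but not the marker. The Segment type and ReadSegments function are hypothetical names for illustration, not part of any library mentioned here, and the sketch skips edge cases like fill bytes between segments.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Segment is one marker segment from a JPEG header.
// (Hypothetical type, for illustration only.)
type Segment struct {
	Marker byte   // second marker byte, e.g. 0xE1 for APP1, 0xE2 for APP2
	Data   []byte // payload, without the marker or the two length bytes
}

// ReadSegments walks the segments between the SOI marker (0xFF 0xD8)
// and the SOS marker (0xFF 0xDA), where the compressed image data begins.
func ReadSegments(jpg []byte) ([]Segment, error) {
	if len(jpg) < 2 || jpg[0] != 0xFF || jpg[1] != 0xD8 {
		return nil, errors.New("not a JPEG: missing SOI marker")
	}
	var segs []Segment
	pos := 2
	for pos+4 <= len(jpg) {
		if jpg[pos] != 0xFF {
			return nil, fmt.Errorf("expected a marker at offset %d", pos)
		}
		marker := jpg[pos+1]
		if marker == 0xDA { // SOS: header segments are done
			return segs, nil
		}
		// The length field is big-endian and includes its own two bytes.
		size := int(binary.BigEndian.Uint16(jpg[pos+2 : pos+4]))
		if size < 2 || pos+2+size > len(jpg) {
			return nil, fmt.Errorf("bad segment length at offset %d", pos)
		}
		segs = append(segs, Segment{Marker: marker, Data: jpg[pos+4 : pos+2+size]})
		pos += 2 + size
	}
	return nil, errors.New("no SOS marker found")
}
```

Filtering the result for Marker == 0xE2 gives the APP2 payloads, which is where an embedded ICC color profile lives.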
I started with this ICC profile extraction library (found on Stack Overflow) to pull out the information I needed. It took a while to get the data out in the raw form I needed, including the two bytes that give the size of the segment, so I could copy it perfectly to the final image. Then I tried following the advice on an old Go issue to insert the APP2 data in the correct spot while encoding the JPEG. Now that I’m writing about it, it just occurred to me I was probably inserting that data at the wrong step in the process. But at the time, I had trouble figuring out why the image was always coming out corrupted, so after swimming around in a hex editor for a while, I just forked the image/jpeg package and added a few lines to insert the data during encoding. It worked!
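For comparison, here’s a sketch of one way to write a segment back without forking the encoder: encode to a buffer with the standard jpeg.Encode, then splice the raw segment in right after the two-byte SOI marker that opens every JPEG. This is an assumption about what could work, not what the fix described above does (that one inserts the data inside a forked image/jpeg), and RawSegment and InsertAfterSOI are made-up helper names.

```go
package main

import (
	"encoding/binary"
	"errors"
)

// RawSegment builds a complete segment: 0xFF, the marker byte,
// a two-byte big-endian length, then the payload.
// (Hypothetical helper, for illustration only.)
func RawSegment(marker byte, payload []byte) ([]byte, error) {
	size := len(payload) + 2 // the length field counts its own two bytes
	if size > 0xFFFF {
		return nil, errors.New("payload too large for a single segment")
	}
	raw := make([]byte, 4, 4+len(payload))
	raw[0], raw[1] = 0xFF, marker
	binary.BigEndian.PutUint16(raw[2:], uint16(size))
	return append(raw, payload...), nil
}

// InsertAfterSOI returns a copy of an encoded JPEG with raw spliced in
// directly after the SOI marker (0xFF 0xD8). The input is left untouched.
// (Hypothetical helper, for illustration only.)
func InsertAfterSOI(encoded, raw []byte) ([]byte, error) {
	if len(encoded) < 2 || encoded[0] != 0xFF || encoded[1] != 0xD8 {
		return nil, errors.New("not a JPEG: missing SOI marker")
	}
	out := make([]byte, 0, len(encoded)+len(raw))
	out = append(out, encoded[:2]...) // SOI
	out = append(out, raw...)         // the preserved segment
	out = append(out, encoded[2:]...) // everything the encoder wrote after SOI
	return out, nil
}
```

RawSegment also shows where those two size bytes come from: the length field covers itself plus the payload, but not the 0xFF marker bytes in front of it. If the encoded file begins with a JFIF APP0 segment, the new segment would need to go after that one rather than directly after SOI.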
To wrap things up, I cleaned up my new snapas/img library a bit, added some tests, and included the ability to pull out those common APP segments we’re interested in, in their raw form. We still need to be able to combine multiple segments of the same type when encoding, but for now, the functionality is there on Snap.as: backend orientation correction and resizing, with color correction data retained.
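As a rough idea of what handling multiple segments involves for color profiles: the ICC specification lets a large profile span several APP2 segments, each starting with the identifier ICC_PROFILE and a zero byte, then a 1-based chunk number and the total chunk count. The sketch below reassembles such chunks from raw APP2 payloads. JoinICCChunks is a hypothetical helper, not something snapas/img provides, and it only covers the reading side of the problem.

```go
package main

import "errors"

// iccHeader is the identifier that opens every APP2 payload carrying
// part of an ICC profile.
const iccHeader = "ICC_PROFILE\x00"

// JoinICCChunks reassembles an ICC profile from APP2 payloads (without
// marker or length bytes). Payloads that aren't ICC chunks are skipped,
// and chunks may arrive in any order.
// (Hypothetical helper, for illustration only.)
func JoinICCChunks(payloads [][]byte) ([]byte, error) {
	n := len(iccHeader)
	var chunks [][]byte
	for _, p := range payloads {
		if len(p) < n+2 || string(p[:n]) != iccHeader {
			continue // some other kind of APP2 segment
		}
		seq, total := int(p[n]), int(p[n+1]) // chunk number (1-based), chunk count
		if chunks == nil {
			chunks = make([][]byte, total)
		}
		if seq < 1 || seq > len(chunks) || total != len(chunks) || chunks[seq-1] != nil {
			return nil, errors.New("inconsistent ICC chunk numbering")
		}
		chunks[seq-1] = p[n+2:]
	}
	if chunks == nil {
		return nil, errors.New("no ICC profile chunks found")
	}
	var profile []byte
	for _, c := range chunks {
		if c == nil {
			return nil, errors.New("missing ICC profile chunk")
		}
		profile = append(profile, c...)
	}
	return profile, nil
}
```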
(Special thanks to Dave for pointing out the image issues and helping out!)