Thursday, January 22, 2015

Working with Photo Metadata

Have you ever wondered how Google knows that a given image is relevant to your search when you use image search?

They look at a couple of different things - like file name and the content of the webpage around it - but one of the big factors is the image's metadata.

What is Metadata?
Metadata is information that is attached to a file on the web (such as an HTML webpage or a photo) that is usually not visible to the general public, but can be read by search engines or viewed using certain software.
There are different categories of metadata - it can be embedded within the image itself or attached as an accompanying file. It is possible to add metadata to an image in Windows by right-clicking and editing the image properties; however, this data may not travel with the image when it is transferred off of your computer and onto the web. On webpages, you can also add metadata as part of the HTML tags. For example, <img src="IMG URL" alt="alternate text here is a form of metadata">.
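If you want to check which images on one of your own pages are missing alternate text, a short script can list them. This is only a rough sketch using Python's built-in HTML parser; the class name AltTextCollector and the function collect_alt_text are names I made up for the example, and the sample page is invented.

```python
from html.parser import HTMLParser


class AltTextCollector(HTMLParser):
    """Collects (src, alt) pairs for every <img> tag in a page."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attributes = dict(attrs)
            # alt is None when the attribute is missing entirely
            self.images.append((attributes.get("src"), attributes.get("alt")))


def collect_alt_text(html):
    """Return a list of (src, alt) tuples found in an HTML string."""
    parser = AltTextCollector()
    parser.feed(html)
    parser.close()
    return parser.images


# Invented sample page: the second image has no alternate text.
page = '<img src="ring.jpg" alt="Silver ring with blue stone"><img src="logo.png">'
for src, alt in collect_alt_text(page):
    print(src, "->", alt if alt else "MISSING ALT TEXT")
```

Any image reported as missing is one that neither search engines nor screen readers have a description for.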

Why is Metadata Important?
Metadata is used as one factor (out of many) by search engines to determine what is on a page and if it is relevant to a given search. From what I've read, embedded metadata is not the most important factor for the major search engines - the alternate text added on the webpage code is FAR more important. However, things always change, every little bit helps, and there are other reasons why embedded metadata can be important.

Image metadata is also read aloud by screen readers, which lets people who are blind or have low vision know what is in an image they cannot see. Alternate text is considered one of the more important forms of metadata for search relevance and screen readers alike.

Embedded image metadata can be extremely important in terms of establishing authorship and copyright. There's even some talk on Capitol Hill of changing copyright laws with regard to online photography. (The proposed change would basically make photos fair use if they don't have information with them about who the copyright owner is and how to contact them - this is information that can be recorded with metadata. It's also why some people are not happy when Facebook and Twitter strip metadata from images.)

Some online image services read the embedded metadata and use it to help describe, organize, and administer images. For example, WordPress uses it to help you administer all the images on your site. Flickr also uses it. I don't think Pinterest uses it, but I could be wrong.


Fortunately, it's pretty easy to embed metadata using either Adobe Photoshop or GIMP.

Adding Metadata in GIMP and Photoshop
In GIMP (assuming you have at least version 2.8), all you need to do is go to File -> Properties to edit an image's metadata. Photoshop is similar, but the command is File -> File Info.


From there, it brings up a screen like one of these:
[Image: Metadata Dialog in GIMP]
[Image: Metadata Dialog used by Adobe]
The screen has multiple tabs. The main description area lets you add a title, an author, a description of the photo, the name of the person who wrote the description, and keywords. You can also use the dialogs to indicate whether the image is protected under copyright, and by whom.

Adobe's dialog (used in Photoshop and Lightroom) goes further, allowing you to add IPTC data and the extended version of IPTC data. (Both are standards adopted by publishing organizations to standardize how this stuff is recorded. The extended version came about to catalog more information for certain groups - namely the stock photography industry and those who document cultural heritage.) Adobe also allows for custom tags so organizations can document every single person who has touched an image and where any component of that image came from.

From a business perspective, the thing you're likely to be most concerned about is the copyright and contact information. I have actually had someone contact me concerning the copyright for an image before - they wanted to use it as the basis for a sports team logo in Eastern Europe. This goes to show you that you never know where your images will end up on the web, so it can be beneficial to leave a trail leading back to you. 


Both Photoshop and GIMP allow you to import and export XMP files (which are essentially templates for filling out all these fields). I recommend creating one that lists all your basic information so you can apply it quickly to any photo you work on. From there, you can add more information if you feel it is necessary.
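To give a rough idea of what is inside one of these XMP files, here is a small sketch that builds a bare-bones one holding just a creator name and a copyright notice. The function name build_xmp and the sample values are made up for illustration. The file an editor exports will contain many more fields, and I have not confirmed that every program will import a file this minimal, so treat it as a peek under the hood rather than a ready-made template.

```python
import xml.etree.ElementTree as ET

# Namespaces used by XMP for the fields in this sketch.
NS = {
    "x": "adobe:ns:meta/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def q(prefix, tag):
    """Build a namespace-qualified tag name for ElementTree."""
    return "{%s}%s" % (NS[prefix], tag)


def build_xmp(creator, rights):
    """Return a minimal XMP document (as a string) with creator and rights."""
    xmpmeta = ET.Element(q("x", "xmpmeta"))
    rdf = ET.SubElement(xmpmeta, q("rdf", "RDF"))
    desc = ET.SubElement(rdf, q("rdf", "Description"), {q("rdf", "about"): ""})
    # dc:creator is stored as an ordered list of names.
    seq = ET.SubElement(ET.SubElement(desc, q("dc", "creator")), q("rdf", "Seq"))
    ET.SubElement(seq, q("rdf", "li")).text = creator
    # dc:rights is stored as a set of language alternatives.
    alt = ET.SubElement(ET.SubElement(desc, q("dc", "rights")), q("rdf", "Alt"))
    ET.SubElement(alt, q("rdf", "li"), {XML_LANG: "x-default"}).text = rights
    return ET.tostring(xmpmeta, encoding="unicode")


# Invented sample values.
print(build_xmp("Jane Doe", "Copyright 2015 Jane Doe. All rights reserved."))
```

Saving the printed text to a file ending in .xmp gives you something you can open in a text editor and compare against the template your own software exports.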

Removing Metadata
If you look closely, you'll see that information about your camera type and even your GPS location can and will be stored in the metadata. If you sell high-end goods, like jewelry, you might not want people to know exactly where you are taking your photos. In this case, you might want to consider running your images through Metability's QuickFix to scrub the metadata off of your images (preferably before you bother adding descriptions and the other bits of information that you want attached to the file).

You can also edit and remove metadata with Windows Explorer by right-clicking on the image and selecting Properties. From there, go to the Details tab and select "Remove Properties and Personal Information". This will open up a menu that will allow you to remove specific properties from the file.

If you don't have Windows, Google's Picasa program will allow you to edit and remove metadata.
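If you are comfortable running a script, scrubbing can also be done by hand. In a JPEG file, the camera and GPS details (EXIF) and any XMP data normally sit in blocks called APP1 segments, and Photoshop/IPTC data sits in APP13 segments. The sketch below copies a JPEG while skipping those blocks. The function name strip_jpeg_metadata is my own, and the code makes simplifying assumptions: it only handles JPEGs, expects an ordinary file layout, and leaves other segments (such as comments or color profiles) alone. Try it on copies, not originals.

```python
import struct

# Segment markers that commonly carry metadata:
# APP1 (0xE1) holds EXIF and XMP, APP13 (0xED) holds Photoshop/IPTC data.
METADATA_MARKERS = {0xE1, 0xED}


def strip_jpeg_metadata(data):
    """Return a copy of JPEG bytes without APP1 and APP13 segments."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG file")
    out = bytearray(b"\xff\xd8")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("unexpected byte where a marker should be")
        marker = data[pos + 1]
        if marker == 0xDA:
            # Start of scan: everything after this is image data, copy as-is.
            out += data[pos:]
            return bytes(out)
        # The two-byte length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        end = pos + 2 + length
        if marker not in METADATA_MARKERS:
            out += data[pos:end]
        pos = end
    # No image data found; copy whatever is left unchanged.
    out += data[pos:]
    return bytes(out)


# A tiny fake JPEG made up for demonstration (not a viewable image).
fake = (b"\xff\xd8" + b"\xff\xe0\x00\x04JF"
        + b"\xff\xe1\x00\x08Exif\x00\x00"
        + b"\xff\xda\x00\x02scan" + b"\xff\xd9")
print(b"Exif" in fake, b"Exif" in strip_jpeg_metadata(fake))
```

For a real photo you would read the file's bytes, pass them to the function, and write the result to a new file. Since this removes descriptions and copyright notes along with the GPS data, scrub first and add your authorship information afterward.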

7 comments:

  1. Interesting post, Michelle. I have actually looked at the metadata from some of my pics on Picasa and Flickr, especially when I was trying to remember when I took a specific photo. But I did not realize you could edit the info. Thanks for clearing up some of the mysteries of metadata (:

  2. Very informative, Michelle! I'll have to check into this more for my photos. Thanks so much! :)

  3. By the way, I mention in the post that the search engines place far more importance on the alternate text code added to a website than embedded metadata. The reason is that the embedded metadata (unlike the alternate text) is not readily visible to website visitors - this makes it a potential candidate for keyword stuffing. Google doesn't want to encourage stuffing, so they weigh the alternate text much, much higher than embedded data for search relevance. (I've even heard one person claim that Google doesn't consider embedded data at all for search relevance - that it only looks at the alternate text data entered in the website's HTML code.)

    Again, the embedded data is still useful for establishing authorship and if you use WordPress to help organize photos. However, I'm not sure it's worth spending a lot of effort trying to optimize for keywords in the descriptions. (The way things are currently set up, at least.)

  4. I have to admit this information is new to me. Wow, there is always so much more to learn. For now I will just tuck this metadata stuff away in my head until I'm ready to explore this new information. Great post!

  5. Thank you for writing an informative post, Michelle. I always learn something new here.

  6. Wow! I never knew this, but thanks to your article I will be adding this data to my photos from now on!

    Replies
    1. If you have access to Photoshop, you can use Batch Processing in combination with macros to add your authorship data to all of the photos in a given folder automatically. This can save you a TON of time.
