I’m scanning and organizing my photo collection, and I’ve discovered some important principles to follow when tagging the photos.

First, I want to make sure that all the important metadata is stored in the photos themselves, rather than solely in the DigiKam database. This ensures that I can open or manipulate my photos in any application without losing the effort I’ve put into organizing them. DigiKam can easily be configured to do this seamlessly in the background, but there are some implications for the decisions I make about file formats and tags:

1) I had to convert many files from .tif to .png, because DigiKam couldn’t store metadata in my .tif files. PNG is a good replacement because it is lossless and open.

2) The way that tags are presented in DigiKam is not quite the way they are stored in the Keywords metatags. DigiKam shows a hierarchy of tags, as if each tag were stored as level1/level2/level3. In the actual metadata that other applications (like Nautilus) use, the tags appear to be stored as a comma-separated list: level1, level2, level3. This becomes a problem when you have multiple nested tags attached to the same image, because a comma is also used to separate one nested tag from the next, for example: level1a, level2a, level3a, level1b, level2b, level3b. If you create nested tags that share names, they will get mixed up when you try to recreate the tag hierarchy in some future application, or when another user tries to use DigiKam to look at your photo archive (which is how I discovered this).

Here is an example of what I mean: let’s say you have a photo of your friend Bob, taken at your other friend Jennifer’s birthday party. You want to tag the photo so that you know who is in it and where it was taken. You create a tag tree like “people/friends/Bob” and another like “events/birthday/Jennifer”. In DigiKam, you can now pull up this photo either with all Bob photos or with all of Jennifer’s birthday photos. This is fine as long as you don’t also have “people/friends/Jennifer” or “events/birthday/Bob” tags. If you do, then when DigiKam or another application tries to recreate the tree (or just tries to use the tags without imposing a hierarchy), your photo tagged “people, friends, Bob, events, birthday, Jennifer” could just as well be a photo of Jennifer at Bob’s birthday party, which means you may never find that photo when you want to see it later.
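To make the ambiguity concrete, here is a minimal Python sketch of the flattening described above. It is only a model of the problem, not DigiKam’s actual code, and the helper name flatten is my own invention for illustration.

```python
# Hypothetical helper: models the flattening of tag paths into plain keywords.
def flatten(tag_paths):
    """Turn slash-separated tag paths into one flat keyword list."""
    keywords = []
    for path in tag_paths:
        keywords.extend(path.split("/"))
    return keywords


# Bob at Jennifer's birthday party
photo_a = ["people/friends/Bob", "events/birthday/Jennifer"]
# Jennifer at Bob's birthday party
photo_b = ["people/friends/Jennifer", "events/birthday/Bob"]

flat_a = flatten(photo_a)
flat_b = flatten(photo_b)

print(flat_a)  # ['people', 'friends', 'Bob', 'events', 'birthday', 'Jennifer']
# Treated as an unordered set of keywords, the two photos look identical.
print(set(flat_a) == set(flat_b))  # True
```

Any application that treats the keywords as a plain, unordered list has no way to tell these two photos apart.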

The solution to this? Make sure every tag is unique and makes sense by itself, even without its tree context. In the above example, I would create the tags “people/friends/Bob” and “events/birthday/Jennifer’s Birthday”. I would also select the “Toggle Auto/parent” option, which ensures that when you pick a sublevel tag, all of its parent levels are selected as well.
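If you keep a list of your tag paths somewhere, a small script can flag names that would collide once the tree context is lost. This is a rough sketch under my own assumptions: the function name ambiguous_names and the sample tree are hypothetical, and the tag list is typed in by hand rather than read from DigiKam.

```python
from collections import defaultdict


def ambiguous_names(tag_paths):
    """Return tag names that occur under more than one parent path."""
    parents_by_name = defaultdict(set)
    for path in tag_paths:
        parts = path.split("/")
        for i, name in enumerate(parts):
            # Record the parent path this name appears under ("" for top level).
            parents_by_name[name].add("/".join(parts[:i]))
    return {
        name: sorted(parents)
        for name, parents in parents_by_name.items()
        if len(parents) > 1
    }


# Hypothetical tag tree, written by hand for illustration.
tree = [
    "people/friends/Bob",
    "people/friends/Jennifer",
    "events/birthday/Bob",
    "events/birthday/Jennifer's Birthday",
]

print(ambiguous_names(tree))  # {'Bob': ['events/birthday', 'people/friends']}
```

Here “Bob” is flagged because it exists under two different parents, while “Jennifer” and “Jennifer's Birthday” are distinct names and pass the check.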

If you want to check what is actually being stored in your photo files, the following command is helpful:

exiv2 -pa filename

If you check, you’ll see that there’s an “Xmp.digiKam.TagsList” metatag which stores the tags using slashes to separate the tree levels, but the other keyword tags, “Iptc.Application2.Keywords” and “Xmp.dc.subject”, separate the tree levels with commas, which allows for the confusion in programs other than DigiKam. In my case, I think my tags ended up confused because I didn’t always have all of the parent tags checked, causing the “Xmp.digiKam.TagsList” to contain things like “events, Jennifer, friends, Bob” rather than “events/birthday/Jennifer, people/friends/Bob”.
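To compare those three fields side by side, something like the following Python sketch could filter the exiv2 listing. It assumes each line has four whitespace-separated columns (key, type, count, value). The function name tag_fields is hypothetical, and the sample text is written by hand from the description above, not captured from a real file.

```python
# Field names are the ones mentioned in the text above.
TAG_FIELDS = {
    "Xmp.digiKam.TagsList",
    "Iptc.Application2.Keywords",
    "Xmp.dc.subject",
}


def tag_fields(exiv2_output):
    """Collect the values of the tag-related fields from `exiv2 -pa` style text.

    Assumption: each line is "key  type  count  value".
    """
    found = {}
    for line in exiv2_output.splitlines():
        columns = line.split(None, 3)
        if len(columns) == 4 and columns[0] in TAG_FIELDS:
            # A field may repeat across lines, so keep a list per key.
            found.setdefault(columns[0], []).append(columns[3].strip())
    return found


# Illustrative sample only; real output will differ in spacing and counts.
sample = """\
Xmp.digiKam.TagsList   XmpSeq  2  people/friends/Bob, events/birthday/Jennifer
Xmp.dc.subject         XmpBag  6  people, friends, Bob, events, birthday, Jennifer
Exif.Image.Make        Ascii   6  Canon
"""

for key, values in tag_fields(sample).items():
    print(key, "->", values)
```

To run it against a real photo, the text could come from subprocess.run(["exiv2", "-pa", filename], capture_output=True, text=True).stdout, assuming exiv2 is installed and on your PATH.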

I highly recommend this primer on Digital Asset Management in DigiKam. I found it very helpful in sorting out my thinking about all sorts of issues with DigiKam.

More on photo metadata here.