Category Archives: Technologies

GeoSetter on photos from Nikon NEF

So you’ve done a photo shoot in RAW, spent time adjusting each photo’s composition, and saved the final results to a set of JPGs. Now you want to geo-tag the set. GeoSetter appears to correctly geo-locate each photo, and all looks good until you save the changes and get:

Warning: Unknown format (800) for SubIFD tag 0x0
Warning: [minor] Entries in SubIFD were out of sequence. Fixed.
Error: Bad format (800) for SubIFD entry 0

Now you can’t save the updates, so you can’t geo-tag your photos. OK, so we try Google’s GPicSync, but it also fails. If you enable storing the latitude/longitude as keywords, that does work, but neither program seems able to modify the actual latitude/longitude fields of the image.

So we need to strip out the offending unrecognized data so that these programs can correctly store the location, while keeping the necessary information such as time stamps. And we need to do it as a batch process; we don’t want to hand-edit the EXIF data for each image. Many of the high-level photo management tools (including GPicSync and GeoSetter) depend on the low-level tool ExifTool.

Using the Windows GUI front end to this tool, it reports the Nikon information as warnings, data it does not understand. Unfortunately, I saw no way to tell the tool to remove that block of data from the image, only to modify data within the sections it understood.
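If you are comfortable on the command line, the ExifTool FAQ does suggest a way to rebuild the metadata of a damaged image, copying everything it understands into a fresh set of tags. I didn’t end up trying it on these files, and image.jpg below is just a placeholder:

exiftool -all= -tagsfromfile @ -all:all -unsafe image.jpg

ExifTool keeps the untouched file as image.jpg_original, so nothing is lost if the rebuild goes wrong.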

What did work was ViewNX, the tool provided by Nikon. Select the images, and from the File menu click Convert Files… Check the box “Remove XMP/IPTC information”, leave the others unchecked, and click Convert…

Now the unknown parts are removed, and GeoSetter will have no problems re-saving the files after geo-tagging.
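If you want to verify what remains in a converted file, ExifTool can list every tag along with its group; the file name here is just an example:

exiftool -a -G1 -s converted.jpg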

rsync, basic is best

I’ve tried countless backup programs over the years in search of the best solution for my needs. Virtually all of them do the standard full, incremental or differential backups. This is fine for archival purposes, since you can recover all documents lost after a failure.

The downside is that, unless you do a full backup each time, after a failure you have to piece together the last state of the files by combining the full and incremental backups. This can require a lot of manual merging: if files were moved between incremental backups, the recovered file system ends up with two copies. Similarly, obsolete files that were deliberately deleted will be restored, since file deletions are not recorded by incremental or differential backups.

What I want is essentially a full backup or mirror each time, so that the backup always represents an exact copy of what I’m backing up. Then, in the case of a failure, restoring is a simple file copy to a new disk, or in an emergency the backup can be used directly since it is identical to the original.

Backing up terabytes of data is still too slow and costly to keep a sequence of full backups. A mirror can be kept in sync more quickly, but if you accidentally delete a file and the deletion gets mirrored, you lose it in the backup as well. What would be ideal is mirroring where any deleted or changed files are kept in a side folder: the main mirror folder always contains an exact copy of the original, but any previous revisions or deleted files can still be recovered. This is similar to what Time Machine does for OS X, but rsync can do this for Linux and other platforms using related projects.

Wading through the large number of options for rsync can be intimidating. It’s a powerful tool, and so it can be disastrous if used incorrectly; you don’t want to accidentally swap source and destination with the --delete option, for example. Try out the options in a sandbox first until you see how they work. A good first step may be to use a graphical front end such as grsync, where all the options are clearly labeled with context help.
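For a command-line sandbox, the --dry-run (-n) option is the safest way to experiment: rsync reports what it would transfer or delete without touching anything. The paths below are throw-away test folders:

rsync -r -t -v --dry-run --delete /tmp/sandbox/src/ /tmp/sandbox/dst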

The mirroring command I’d use to back up a drive called nas1 would be as follows:

rsync -r -t --progress --delete -b --backup-dir=/mnt/backup/$(date +%F)/nas1 /mnt/nas1/ /mnt/backup/nas1

  • -r to recurse through the directories
  • -t to preserve the timestamps of the original files
  • --progress to display the progress
  • --delete to remove from the destination any files deleted from the source. This makes it a true mirror, identical to the source
  • -b to back up changed and deleted files on the destination instead of discarding them
  • --backup-dir where to move those backups, in a date-specific folder. Files on the mirror which are being replaced or deleted will be moved here instead, so they can be recovered if needed.
  • /mnt/nas1/ the drive being backed up (the trailing slash copies its contents rather than the folder itself)
  • /mnt/backup/nas1 the folder to place the mirror image into

The result is the drive ‘nas1’ mirrored into the folder ‘nas1’ on the backup drive, with previous versions and deleted files moved to a date-specific folder such as 2009-04-20/nas1 on the same drive.
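Recovering an accidentally deleted or overwritten file is then just a copy out of the dated folder; the path here is only an example:

cp /mnt/backup/2009-04-20/nas1/photos/img_001.jpg /mnt/nas1/photos/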

Virtual NAS: better in theory

I was interested in setting up a network attached storage (NAS) server without the added expense or clutter of another system running. The concept of a virtual NAS seems simple: run one of the freely available NAS virtual machines, connect some spare USB drives to the machine, do some simple web-based configuration, and you’re done.

The attractiveness of such a system:

  • has all the file transfer protocols you could want already installed
  • isolates the NAS from a host system for security and configuration
  • the virtual machine can quickly be moved to another host in case of a host failure

The two systems I tried were FreeNAS and openfiler, using the pre-made virtual machine images. My setup consisted of three USB drives, one 1TB and two 500GB, given the network share names NAS1, NAS2 and NAS3 respectively. There were a few criteria I cared about: reboot robustness, reliability, recovery, speed, and ease of use.

Both had issues with reboot robustness, though the problems with openfiler seemed worse, requiring all drives to be set up again after a power failure. The main issue is related to the order in which USB devices are discovered by the host system or VMware at boot. If, say, drive 2 fails to be discovered at startup due to a drive problem, the NAS assumes drive 3 is drive 2 and that drive 3 is missing. This caused problems for network machines mapped to the name ‘NAS2’ that were really connected to NAS3. It seems the NAS should have a way of mapping drives by a hardware identifier.
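For comparison, on a plain Linux host this is exactly what mounting by filesystem UUID solves; the device name and UUID here are made up. First find the partition’s UUID:

sudo blkid /dev/sdb1

Then an /etc/fstab entry keyed on that UUID always mounts the drive at the same place, regardless of the order in which the USB devices are discovered:

UUID=1234-ABCD  /mnt/nas2  ext3  defaults  0  2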

On top of the reconnection issue, the slow web UI for openfiler made redoing the configuration after a failed reboot a painful experience.

After rejecting openfiler for the rebooting issues, I went ahead with FreeNAS and started pushing large amounts of data to the server. Unfortunately, the system would freeze up after a period of time, on the order of hours, requiring a reboot of the virtual machine and a restart of the transfers. This didn’t seem to be tied to a specific protocol, as I tried SMB, FTP, and SSH. Perhaps the 256MB of memory allocated to the VM was not enough, but I made no changes to the VMs being tested.

In addition to having to restart large transfers after a failure, the transfers themselves were slow, averaging 4-5MB/s. At that rate a single terabyte takes roughly two and a half days, so fully transferring terabytes of data takes on the order of days. This compares to an average of 15-20MB/s when transferring to the host machine over gigabit ethernet.

Finally, in the case of a NAS failure, it would be nice to be able to plug the USB drives into another machine and get the data off in an emergency. Openfiler supports ext3, so those drives could be plugged into the host Ubuntu system and read, but FreeNAS uses UFS, which Ubuntu does not natively support.

So for the home user, it seems the potential problems make it not worth doing just yet. There are probably ways of working around some of these issues if you are willing to create or modify the virtual machine properties, but this was a quick and dirty, as-is test. For now, a USB drive mounted on an Ubuntu server with the folder shared is fast, reliable, and just works.
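For reference, sharing such a folder over the network takes only a few lines of Samba configuration in /etc/samba/smb.conf, followed by a restart of the Samba service. The share name and path are just examples, and guest access only makes sense on a trusted home network:

[nas1]
   path = /mnt/nas1
   read only = no
   guest ok = yes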