Gallery Automation Project (nearly) Done

NOTE: All of this has been massively superseded (here in 2010) with WordPress and Zenfolio together with Lightroom. Again, I leave this here for “historical” purposes…

I’ve been working off and on for a week or more on a major workflow automation project for the blog. Let me start off by quickly pointing out what this means for the readers in terms of new features:

  • Every photo gallery can now be viewed in either of two forms: HTML gallery or flash-based slideshow
  • Links to both are embedded in the articles now as dropdown lists (“View photos as…”)
  • The number of photos and pages (for the HTML gallery) are listed below the dropdown list
  • An index of the galleries exists both in HTML and slideshow form
  • The dropdown list below the random image in the top right of every page will take you to the indexes as well as the gallery categories
  • The random image box in the top right by the banner now runs the slideshow or, if you don’t have flash installed, my original random image script requiring no plug-ins

So what’s the “workflow automation” all about? Well, generating galleries and linking them to a blog article is already a multi-step, albeit relatively easy, process. The addition of the slideshows, though, added a significant extra layer of work in the form of carefully structured XML configuration files. Since I want to be consistent and I don’t want a ton of extra work to stymie the urge to post new galleries, I figured it was time to streamline the process as much as possible. Past experience has told me that once the posting process becomes too complex, spur of the moment and casual posting grinds to a halt. I want the process to be trivial enough that I won’t have any excuses not to post more frequently. 🙂

The behind-the-scenes story of how this process works is, of course, disturbingly geeky. Fortunately, there are plenty of other fellow geeks out there who might be interested. The rest of this piece (and I predict it will be rather long) will cover the gist of what went into this project thus far. If the binding together of iView MediaPro, AppleScript, Linux, OS X, ssh, scp, a big chunk of Python code, MySQL databases and various bits of HTML and XML technology interests you… read on.
Here’s a linked list of the various tools and technologies used:

Detailed description below the fold…

In a nutshell, making a photo gallery for this site is a pretty straightforward process. I currently do all of my photo management with iView MediaPro on the Mac.

It’s a fantastic piece of software, all in all, and I manage literally tens of thousands of images with it across multiple drives and servers. I use its templatable HTML gallery feature to generate the halfpress galleries. By doing this I have a consistent means of pushing any catalog I create in iView out with just a few clicks of the mouse. I generally just write them out to a specific directory within the structure of the blog and then code a link into the article.

The random image box at the top right of the halfpress pages was one of my earlier developments. Using iView’s Text Table feature, I could write out a text file in CSV format that contains the filenames, captions, etc., of the images in the current catalog. I would then feed that CSV file through a CGI interface I wrote in Python on the halfpress server. This script would parse the data into a MySQL database that had all the information needed to build HTML links to the thumbnails and images in any gallery. The database doesn’t need to contain the photos… just the paths and filenames as they appear on the server. The little CGI script that selects images for the page simply makes a random selection of a photo from the database on each page load, builds the relevant HTML to display the existing thumbnail and embeds the image-specific link into the gallery. Voila… lightweight random images that are clickable, taking you right into the gallery each one came from, and without duplicating any of the assets.
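
To make the idea concrete, here is a minimal sketch of that CSV-to-database-to-random-link flow. It is not the original script: the CSV column names, the sample captions, the table layout and the URL structure are all assumptions made for illustration, and sqlite3 stands in for MySQL so the example runs on its own.

```python
import csv
import io
import random
import sqlite3
from html import escape

# Hypothetical CSV layout and sample data; the real iView Text Table
# columns are not shown in the post.
SAMPLE_CSV = """filename,caption,gallery
_mg_1346.jpg,Harbor at dusk,st-thomas
_mg_1350.jpg,Beach & palms,st-thomas
"""


def load_csv(conn, text):
    """Parse the CSV dump and store one row per image (references only, no image data)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS images ("
        "id INTEGER PRIMARY KEY, filename TEXT, caption TEXT, gallery TEXT)"
    )
    rows = [(r["filename"], r["caption"], r["gallery"])
            for r in csv.DictReader(io.StringIO(text))]
    conn.executemany(
        "INSERT INTO images (filename, caption, gallery) VALUES (?, ?, ?)", rows)
    return len(rows)


def random_image_html(conn, rng=random):
    """Pick one image at random and build a clickable thumbnail for it."""
    count = conn.execute("SELECT COUNT(*) FROM images").fetchone()[0]
    # Assumes the IDs run from 1 to count with no gaps.
    image_id = rng.randint(1, count)
    filename, caption, gallery = conn.execute(
        "SELECT filename, caption, gallery FROM images WHERE id = ?", (image_id,)
    ).fetchone()
    # Hypothetical URL layout: /galleries/<gallery>/{images,thumbnails}/<file>
    base = f"/galleries/{gallery}"
    return (f'<a href="{base}/images/{filename}">'
            f'<img src="{base}/thumbnails/{filename}" alt="{escape(caption)}"></a>')


conn = sqlite3.connect(":memory:")
load_csv(conn, SAMPLE_CSV)
print(random_image_html(conn))
```

Picking a random ID between 1 and the row count only works if the IDs have no gaps, which is why renumbering matters whenever a gallery is removed or reloaded.
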
So, in a nutshell, the pre-automated workflow process was:

  • Bring new photos into iView
  • Sort photos and make selections
  • Open photos from iView into Photoshop for editing
  • Final cataloging of edited images in iView
  • Generate HTML gallery with iView and write it via AppleTalk to the server
  • Create a CSV dump of relevant catalog information from iView
  • Feed the CSV file through a private CGI interface on the server
  • Write the new blog entry and link to the gallery

Once I became attracted to using the flash-based SlideShowPro, though, the complexity went up a bit. Rather than go all flash, I want every gallery to be available in both forms so the user can choose what they like best. This means I do everything listed above plus create an XML configuration file for SlideShowPro that includes the somewhat complex, server-specific paths, captions, etc. I’d then be creating two links in the text to the two viewing options.

My first attempts at automating the XML creation became possible with the new iView MediaPro 3.0 release. It has the ability to not only write out XML dumps of catalog data, but also the option to template the XML via XSLT which is, basically, the CSS of the XML world. I wrote an XSL template that did much of the work of making a SlideShowPro configuration file, but it was still going to require per-gallery hand editing of the XML.

The Automated Workflow
Basically, what I’ve ended up with is a combination of AppleScript on the desktop and Python on the server with MySQL in the middle to hold the data.

Here’s how the process goes:

  • A final catalog is created as per normal (import, sort, edit, caption, etc).
  • I mark any photos from the catalog that I want included in the random image box pool with a ‘1’ (red label) within iView
  • I choose one photo to be the thumbnail representation for the slideshow index by marking it with a ‘2’ (green label) within iView

  • I launch my AppleScript using iView’s scripting menu
  • The script asks me for the directory name that should be used for the new gallery on the server (e.g. “st-thomas” for the Saint Thomas, USVI gallery).

  • I’ve broken the logic of the workflow into stages. When making changes to existing galleries it might be more efficient or desirable to conduct certain parts of the process and skip others. The actions I wish to skip can be de-selected from the list before continuing. Correcting a typo in a caption, for instance, would only require the HTML Gallery, XML and parsing stages while skipping the redundant creation of slideshow thumbnails.

  • The script now runs through the selected stages and ends with a dialog box notifying me of successful completion.
  • The final step is to write the blog article. The photos are linked in by adding a single HTML server-side include (SSI) that injects the dropdown box with the image and page count automatically computed (the code is created by the server’s Python script described below). To continue the example from the St. Thomas photos, the include would yield this (with both javascript and non-javascript methods):
    <!--#include virtual="/galleries/st-thomas/include.html" -->

  • Done.
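
For a sense of what the included file might contain, here is a rough sketch of a generator for it. The images-per-page value, the `slideshow.html` filename and the exact markup are assumptions, and `math.ceil` is used here as a simple way to get the page count rather than whatever the real script does.

```python
import math
from html import escape

IMAGES_PER_PAGE = 20  # hypothetical; the template's real page size is not stated


def page_count(image_count, per_page=IMAGES_PER_PAGE):
    """Number of HTML gallery pages; a partly filled last page still counts."""
    return math.ceil(image_count / per_page)


def include_html(gallery, image_count, per_page=IMAGES_PER_PAGE):
    """Build the dropdown (plus plain links for browsers without JavaScript)."""
    base = f"/galleries/{escape(gallery)}"
    pages = page_count(image_count, per_page)
    return "\n".join([
        '<form action="#">',
        '  <select onchange="if (this.value) window.location = this.value;">',
        '    <option value="">View photos as…</option>',
        f'    <option value="{base}/index.html">HTML gallery</option>',
        f'    <option value="{base}/slideshow.html">Slideshow</option>',
        '  </select>',
        '  <noscript>',
        f'    <a href="{base}/index.html">HTML gallery</a> |',
        f'    <a href="{base}/slideshow.html">Slideshow</a>',
        '  </noscript>',
        '</form>',
        f'<p>{image_count} photos, {pages} pages</p>',
    ])


print(include_html("st-thomas", 45))
```

Writing this string to `include.html` inside the gallery directory is all the article needs; the SSI line above then pulls it in at request time.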

Behind The Scenes: Desktop
What’s happening behind this script is fairly complex and, despite initial appearances, more stable than you might think.

The AppleScript does a series of things:

  • Prompts for the ultimate gallery directory name on the server
  • Prompts for the stages of the workflow you wish to conduct
  • Builds a tree of folders on the desktop to hold the data on its way to the server
  • Tells iView to build an HTML Gallery using a pre-defined option set for halfpress (template, etc)
  • Tells iView to build appropriately sized thumbnails for the flash slideshow interface to match the catalog
  • Tells iView to do a proper XML dump of the catalog data
  • Because the line endings on the Mac don’t match the Linux standard, I call an awk script to rewrite the XML dump with Linux-style (LF) line endings (irritating, but handy)
  • SCP is called to transfer the XML file, thumbnails and HTML gallery to the server
  • SSH does a remote shell call to the server to execute the Python “parse” script
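
The last three steps can be sketched in Python (the real workflow uses AppleScript and awk, so this is a stand-in, not the original code). The host name, remote paths and parse-script location are hypothetical, and the commands are only built here, not executed.

```python
import shlex


def to_unix_line_endings(data: bytes) -> bytes:
    """Convert CRLF and bare CR line endings to LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def transfer_commands(staging_dir, gallery,
                      host="user@example.org",                     # hypothetical
                      remote_root="/var/www/galleries",            # hypothetical
                      parse_script="/usr/local/bin/parse_gallery.py"):  # hypothetical
    """Build (but do not run) the scp and ssh argument lists."""
    # Recursively copy the staged gallery folder to the server.
    scp = ["scp", "-r", f"{staging_dir}/{gallery}", f"{host}:{remote_root}/"]
    # Run the server-side parse script for this gallery over ssh.
    remote = f"python {shlex.quote(parse_script)} {shlex.quote(gallery)}"
    ssh = ["ssh", host, remote]
    return scp, ssh


print(to_unix_line_endings(b"<catalog>\r<item/>\r</catalog>\r"))
for cmd in transfer_commands("/Users/me/Desktop/staging", "st-thomas"):
    print(shlex.join(cmd))
```

Each list could be handed to `subprocess.run(cmd, check=True)` once key-based login is set up, so no password ever appears in the script.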

This is my first hands-on experience with AppleScript and, frankly, I’m not especially fond of it. I spent a lot of time fighting with the syntax and screaming about the lack of coherent documentation. I think this is one of those cases where the “English” syntax is frustrating since I’ve already spent my life thinking in the more abbreviated, ALGOL-ish, logic-driven syntax of various programming languages.

The use of SSH and SCP was a matter of convenience and relative security. I didn’t want to store passwords in any of the scripts, so I used the public-key, “passwordless” login capabilities of ssh to pull this off smoothly. I have it highly restricted, so I think it’s a sane tradeoff… not to mention everything is encrypted with 2048-bit keys. This is also testimony to the power of the UNIX underpinnings of OS X that I can call these tools without a second thought and interact with the Linux server transparently.

Behind The Scenes: Server
When I first started thinking this project through, I considered doing the MySQL work on the desktop side. I quickly decided, though, that I’d separate it out so only those actions most directly related to the desktop’s real role (photo management) would reside here and the rest would be done on the server. This also makes it more likely that everything will continue to work nicely when I’m traveling and behind bizarre proxies and firewalls that might crap out the MySQL connectivity. As it stands now, if the SSH and SCP stages were somehow blocked when traveling, I can hand-execute the server side with a single statement once I get the files transferred by other means (FTP, for instance).

The server side of this process is all encapsulated in a fairly efficient Python application I wrote that manages all the file creation, filesystem manipulation and MySQL interaction.

The system is bound together with a core MySQL database consisting of two simple tables. One table contains a basic index of the galleries including the filesystem path, the formal gallery title and the comment that should appear in the slideshow index. The other table has an entry for every image in every gallery on the system including the filenames, paths, captions, EXIF data, display order within the gallery and a flag indicating whether a given image should be included in the random pool or not. Note that there are no images in the database itself… just references to the existing thumbnails and full sized images as they appear in the gallery. This makes it far more efficient on several levels.
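
As a rough illustration of that two-table layout, here is a sketch using sqlite3 as a stand-in for MySQL. The table names, column names and sample rows are guesses based on the description above, not the actual schema.

```python
import sqlite3

# Hypothetical schema; names and types are assumptions.
SCHEMA = """
CREATE TABLE galleries (
    path    TEXT PRIMARY KEY,   -- directory name on the server, e.g. 'st-thomas'
    title   TEXT NOT NULL,      -- formal gallery title
    comment TEXT                -- text shown in the slideshow index
);
CREATE TABLE images (
    id        INTEGER PRIMARY KEY,
    gallery   TEXT NOT NULL REFERENCES galleries(path),
    filename  TEXT NOT NULL,    -- shared by the thumbnail and the full-sized image
    caption   TEXT,
    exif_date TEXT,
    position  INTEGER NOT NULL,            -- display order within the gallery
    in_random INTEGER NOT NULL DEFAULT 0   -- 1 if the image is in the random pool
);
"""


def random_pool(conn):
    """Return (gallery, filename) for every image flagged for the random pool."""
    return conn.execute(
        "SELECT gallery, filename FROM images WHERE in_random = 1 "
        "ORDER BY gallery, position").fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO galleries VALUES ('st-thomas', 'Saint Thomas, USVI', '')")
conn.executemany(
    "INSERT INTO images (gallery, filename, position, in_random) VALUES (?, ?, ?, ?)",
    [("st-thomas", "_mg_1346.jpg", 1, 1), ("st-thomas", "_mg_1350.jpg", 2, 0)])
print(random_pool(conn))
```

Because only filenames and paths are stored, dropping and reloading a whole gallery is cheap: it is a delete and a handful of inserts, with no image data to move.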

Basically, every incoming gallery carries with it an XML dump of the catalog that contains everything I want to share: captions, EXIF metadata, etc. My Python application, when called from the AppleScript system via SSH or on the command line, does the following:

  • Instantly drops any existing copy of a gallery by the same name from the database to avoid needing to do comparative updates
  • Using Python’s xml.dom.minidom library, it parses the XML output from iView and extracts a pre-defined series of fields that I wish to store in the database

  • EXIF-formatted dates are parsed into more traditional, readable date formats for use in the captions
  • ID numbers are cleared and re-assigned across the entire contents of the database to assure there are no gaps for the older random number script’s selection logic
  • Any images with the random flag are selected from the database and an XML file is generated to drive the new random instance of SlideShowPro that appears in the top right of every page
  • Attention then turns to the new incoming gallery that triggered the Python application call. This includes writing out a gallery-specific XML file for SlideShowPro complete with links, captions, thumbnail references, etc.
  • Case is changed on the filenames where required on thumbnails generated by iView and symbolic links are created within the filesystem to common HTML files shared across the galleries
  • The number of images in the gallery is counted from the database and divided by the standard number of images per page that I use in my template. Floating-point comparisons are then done to determine the page count
  • A small, gallery-specific bit of HTML is written out containing the javascript and non-javascript links to call the gallery from the articles. Also included is the text with the image and page count. It is this little HTML file that is pulled into the written articles with a server-side include to provide the dropdown box for selecting the gallery style
  • Finally, a master XML file encompassing all of the galleries is written out. This file is the one that drives the all-encompassing instance of SlideShowPro with its own internal, thumbnailed index. This allows the user to peruse all the galleries via SlideShowPro and complements the blog’s native HTML gallery index
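
The parsing and date-handling steps might look something like the sketch below. The XML element names, the sample captions and dates, and the lowercase filename rule are assumptions, since iView’s actual output is not reproduced here; the only firm points are the use of `xml.dom.minidom`, the standard EXIF timestamp format, and the ‘1’ label marking the random pool.

```python
from datetime import datetime
from xml.dom import minidom

# Hypothetical XML shape and sample data; iView's real element names differ.
SAMPLE_XML = """<catalog>
  <item>
    <filename>_MG_1346.JPG</filename>
    <caption>Harbor at dusk</caption>
    <date>2006:03:14 17:42:10</date>
    <label>1</label>
  </item>
  <item>
    <filename>_MG_1350.JPG</filename>
    <caption>Beach</caption>
    <date>2006:03:15 09:05:00</date>
    <label>0</label>
  </item>
</catalog>"""


def text_of(node, tag):
    """Return the text of the first <tag> element under node, or ''."""
    found = node.getElementsByTagName(tag)
    if not found or found[0].firstChild is None:
        return ""
    return found[0].firstChild.data.strip()


def readable_date(exif_date):
    """Convert an EXIF timestamp ('YYYY:MM:DD HH:MM:SS') to e.g. 'March 14, 2006'."""
    parsed = datetime.strptime(exif_date, "%Y:%m:%d %H:%M:%S")
    return f"{parsed:%B} {parsed.day}, {parsed.year}"


def parse_catalog(xml_text):
    """Extract one record per image, in catalog order."""
    items = minidom.parseString(xml_text).getElementsByTagName("item")
    return [{
        "filename": text_of(item, "filename").lower(),  # assumed case rule
        "caption": text_of(item, "caption"),
        "date": readable_date(text_of(item, "date")),
        "order": order,                                 # display order
        "in_random": text_of(item, "label") == "1",     # '1' marks the random pool
    } for order, item in enumerate(items, start=1)]


for record in parse_catalog(SAMPLE_XML):
    print(record)
```

Each record maps directly onto one row of the per-image table, so reloading a gallery amounts to deleting its old rows and inserting these.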

Notes and Comments

As with any system like this, the design is predicated on a consistent layout to the filesystem. Everything is stored in a standardized format within the core database and, because the filesystem is laid out in a pre-defined manner, HTML links and other references can be built on the fly by the Python script for use in the XML and the server-side includes.

Since all the galleries exist within a certain directory, all that needs to be known is the name of the enclosing directory and the individual names of the images within. The thumbnails and full-sized images use the same filenames and are stored in parallel directories and scaled to their required sizes. Since iView puts out its HTML Gallery using a set layout, a single filename reference in the database can yield either the thumbnail or the full-sized image just by prepending the enclosing directory names.

In the St. Thomas gallery, for instance, image “_mg_1346.jpg” is stored in the database with a gallery name of “st-thomas”, its relevant caption, EXIF data, etc. A reference to this file in any of its incarnations is easily built by prepending the enclosing directory names.
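
In code, building those references comes down to a couple of path joins. This is only a sketch: the `/galleries` root matches the include path shown earlier, but the `thumbnails` and `images` directory names are placeholders for whatever iView’s template actually writes.

```python
from posixpath import join

GALLERY_ROOT = "/galleries"  # matches the include path used in the articles
THUMB_DIR = "thumbnails"     # hypothetical directory name
IMAGE_DIR = "images"         # hypothetical directory name


def image_urls(gallery, filename):
    """Build the thumbnail and full-size URLs from one stored filename."""
    return {
        "thumbnail": join(GALLERY_ROOT, gallery, THUMB_DIR, filename),
        "full": join(GALLERY_ROOT, gallery, IMAGE_DIR, filename),
    }


print(image_urls("st-thomas", "_mg_1346.jpg"))
```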

SlideShowPro, as stated before, is a flash application that gets its marching orders from an XML configuration file. I’ve described above how the XML files are generated based on the consistent design of the filesystem layout and HTML gallery assets.

I’ll mention that I have at least three copies of the SlideShowPro app “published” from Flash Studio 8.0 to fit each need: the random slideshow, the master slideshow gallery with index, and the individual gallery slideshows. The individual galleries share a single instance of the swf file and read the XML file based on a path-relative call.

Closing Thoughts

To summarize, my workflow is now:

  • Make the catalog
  • Run the script
  • Write the article
  • Embed the single line server-side include to link to the gallery options

That’s pretty much it. All of the galleries, random images, slideshows, slideshow indexes, etc., take care of themselves as a result of the script(s). I can focus on shooting and editing, and far less on gluing the blog together.

Over time I’ll continue to test, tweak and likely make changes to the overall design. This is just the first functional incarnation of the concept and it will probably improve over time. I don’t claim that the code is the most efficient thing on the planet nor do I assume the overall design is nearly as ideal as it could be. That’s half the fun, though, since there are a zillion ways to approach any one problem with these kinds of tools.

If anyone has any questions or comments, don’t hesitate to email me or use the comment fields below.

5 Responses to “Gallery Automation Project (nearly) Done”

  1. Wow, really cool. I use iView media myself and love it. I do a lot of custom templates but nothing to the extent that you have done.
    Congratulations. Looks great. The only thing I would change is to make the iView export look a little more slick like the rest of the site.

  2. Aaron says:

    David – thanks! And, yes, I agree entirely concerning the gallery template being kind of bland… it’s definitely something I’m going to focus on soon! 🙂

  3. RobK says:

    This is great!
    I was wondering what program you are using in the last screen shot?

  4. Aaron says:

    The last screenshot in the post is the top part of the window for MySQL Query Browser. It can be found here:

  5. john says:

    A year or so later, I wonder if you’re still following this method and what you now think of it.