<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Paper Jammed &#187; Geeky</title>
	<atom:link href="http://paperjammed.com/tag/geeky/feed/" rel="self" type="application/rss+xml" />
	<link>http://paperjammed.com</link>
	<description>Has paper taken over your life?</description>
	<lastBuildDate>Fri, 06 May 2011 00:09:18 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.3</generator>
		<item>
		<title>A couple of AppleScript droplets to tweak EXIF timestamps</title>
		<link>http://paperjammed.com/2011/02/14/a-couple-of-applescript-droplets-to-tweak-exif-timestamps/</link>
		<comments>http://paperjammed.com/2011/02/14/a-couple-of-applescript-droplets-to-tweak-exif-timestamps/#comments</comments>
		<pubDate>Tue, 15 Feb 2011 04:52:45 +0000</pubDate>
		<dc:creator>Tad</dc:creator>
				<category><![CDATA[Software]]></category>
		<category><![CDATA[Workflow]]></category>
		<category><![CDATA[AppleScript]]></category>
		<category><![CDATA[Files and Folders]]></category>
		<category><![CDATA[Geeky]]></category>
		<category><![CDATA[Macintosh]]></category>
		<category><![CDATA[Macros]]></category>
		<category><![CDATA[Photos]]></category>
		<category><![CDATA[Scanning]]></category>
		<category><![CDATA[Scripting]]></category>

		<guid isPermaLink="false">http://paperjammed.com/?p=1114</guid>
		<description><![CDATA[Most of the time I don&#8217;t really bother with the timestamp information that my camera embeds in each digital photo. In fact, I can&#8217;t remember the last time I checked to see if the clock was right. Scanned photographs are an entirely different brew. They typically represent events from the distant past, and scanner software [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignright size-medium wp-image-1116" title="iStock_000010531463XSmall" src="http://paperjammed.com/wp-content/uploads/2011/02/iStock_000010531463XSmall-300x193.jpg" alt="" width="300" height="193" />Most of the time I don&#8217;t really bother with the timestamp information that my camera embeds in each digital photo. In fact, I can&#8217;t remember the last time I checked to see if the clock was right.</p>
<p>Scanned photographs are an entirely different brew. They typically represent events from the distant past, and scanner software EXIF data is hit or miss.</p>
<p>I looked for commercial software to handle a few special cases of EXIF data troubles, but came up empty handed. So I wrote a few useful AppleScript droplets that do these tasks quite nicely, and I will share them here.<span id="more-1114"></span></p>
<p><strong>Warning!</strong></p>
<p>These scripts use <strong>jhead</strong> to manipulate and <em>rewrite</em> your JPEG files!</p>
<p>Don&#8217;t be a fool. Experiment first with a safe set of throwaway JPEGs. And never use these tools on original files; always keep a backup.</p>
<p><strong>Prerequisite</strong></p>
<p>All of these scripts depend on a fine piece of free software called <a href="http://www.sentex.net/~mwandel/jhead/">jhead</a> written by <a href="http://www.sentex.net/~mwandel/index.html">Matthias Wandel</a>.</p>
<p>Installation is not difficult, but it does involve the command line.</p>
<ul>
<li>Go to the <a href="http://www.sentex.net/~mwandel/jhead/">jhead site</a></li>
<li>Scroll down to the <strong>Releases</strong> section and look for <strong>Pre-built OS-X Intel executable</strong></li>
<li>Right click on the <strong>jhead</strong> link on that row and choose <strong>Save Linked File to &#8220;Downloads&#8221;</strong></li>
</ul>
<p>At this point, I found that <strong>jhead</strong> was saved as <strong>jhead.txt</strong>. Oh well. We needed to do some command-line magic anyway.</p>
<ul>
<li>Open a terminal window and enter the following:</li>
</ul>
<div class="codecolorer-container bash default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:435px;"><table cellspacing="0" cellpadding="0"><tbody><tr><td style="padding:5px;text-align:center;color:#888888;background-color:#EEEEEE;border-right: 1px solid #9F9F9F;font: normal 12px/1.4em Monaco, Lucida Console, monospace;"><div>1<br />2<br />3<br />4<br /></div></td><td><div class="bash codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">$ <span style="color: #7a0874; font-weight: bold;">cd</span> ~<span style="color: #000000; font-weight: bold;">/</span>Downloads<br />
$ <span style="color: #c20cb9; font-weight: bold;">mv</span> jhead.txt jhead<br />
$ <span style="color: #c20cb9; font-weight: bold;">chmod</span> <span style="color: #000000;">755</span> jhead<br />
$ <span style="color: #c20cb9; font-weight: bold;">sudo</span> <span style="color: #c20cb9; font-weight: bold;">mv</span> jhead <span style="color: #000000; font-weight: bold;">/</span>usr<span style="color: #000000; font-weight: bold;">/</span>local<span style="color: #000000; font-weight: bold;">/</span>bin</div></td></tr></tbody></table></div>
<p>These lines do the following:</p>
<ul>
<li>Lines 1 and 2 navigate to the Downloads directory and remove the &#8220;.txt&#8221; from the name</li>
<li>Line 3 makes the file executable by everyone on your Mac</li>
<li>Line 4 places the file in a public area where everyone on your Mac can see it (you will be prompted for your password)</li>
</ul>
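<p>Before moving on, it is worth confirming that the shell can actually find the new binary. The following is only a minimal sketch: the function names <tt>have_tool</tt> and <tt>report_tool</tt> are made up for illustration, and it assumes that <tt>/usr/local/bin</tt> is already on your PATH.</p>

```shell
# Return success if the named command can be found on the PATH.
have_tool() {
    command -v "$1" >/dev/null 2>/dev/null
}

# Print a short status line for jhead (or any other tool name passed in).
report_tool() {
    if have_tool "$1"; then
        echo "$1 found at $(command -v "$1")"
    else
        echo "$1 not found on PATH"
    fi
}
```

<p>Running <tt>report_tool jhead</tt> should print the path if the install worked, and <tt>jhead -V</tt> will then show the version.</p>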
<p><strong>Stripping All EXIF Data</strong></p>
<p>Sometimes I receive files that have corrupt EXIF data. I had a large quantity of scanned files in my collection that claimed to be scanned some time in 2038, while others insisted that they had been around since 1901. Neither situation is good, and I found that standard EXIF editing tools may fail to change these corrupt EXIF sections.</p>
<p>The answer is to blow away the EXIF data.</p>
<p>Download: <a href="http://paperjammed.com/wp-content/uploads/2011/02/Strip-EXIF.zip">Strip EXIF</a></p>
<p>This zip file contains a compiled AppleScript application. You can unzip it and place the application on your desktop. Safari will probably unzip it for you when you click the link.</p>
<p>To be safe, open AppleScript Editor and use it to open the <strong>Strip EXIF</strong> app to see its magic.</p>
<p>Now you can drop any number of JPEG files onto the <strong>Strip EXIF</strong> app and it will kindly eviscerate each JPEG, removing all traces of EXIF data.</p>
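<p>For the curious, here is roughly what the same job looks like from the command line. This is a sketch, not the source of the droplet: it assumes the work is done by the <tt>-de</tt> option of <strong>jhead</strong> (delete the EXIF header entirely), and the <tt>JHEAD</tt> variable and the <tt>strip_exif</tt> function name are invented for the example.</p>

```shell
# JHEAD can be pointed at a stub (such as echo) to preview the commands.
JHEAD="${JHEAD:-jhead}"

# Delete the whole EXIF header from each JPEG passed in; skip anything else.
strip_exif() {
    for f in "$@"; do
        case "$f" in
            *.jpg|*.jpeg|*.JPG|*.JPEG) "$JHEAD" -de "$f" ;;
            *) echo "skipping (not a JPEG name): $f" ;;
        esac
    done
}
```

<p>Usage would be something like <tt>strip_exif scan001.jpg scan002.jpg</tt>, run on copies rather than originals.</p>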
<p><strong>Adding Basic EXIF Data to a vanilla JPEG</strong></p>
<p>Some tools create JPEG files without EXIF date and time information within. This is typically the hallmark of photo manipulation software and dodgy scanner software. And if you happened to use the <strong>Strip EXIF</strong> app to rip out a bad EXIF block, then you will want to replace it with a proper data block so that you can still use camera date and timestamps.</p>
<p>Download: <a href="http://paperjammed.com/wp-content/uploads/2011/02/Add-Basic-EXIF.zip">Add Basic EXIF</a></p>
<p>Again, unzip the file, place the app on your desktop, and then drop any number of JPEG files onto <strong>Add Basic EXIF</strong>.</p>
<p>The app will set the EXIF date to the file creation timestamp.</p>
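<p>A rough command-line equivalent is sketched below. It assumes the droplet relies on the <tt>-mkexif</tt> option of <strong>jhead</strong>, which creates a minimal EXIF header and, by default, takes the date from the file&#8217;s <em>modification</em> time. For a freshly scanned file that is normally the same moment as its creation. The <tt>add_basic_exif</tt> name is hypothetical.</p>

```shell
# JHEAD can be pointed at a stub (such as echo) to preview the commands.
JHEAD="${JHEAD:-jhead}"

# Create a minimal EXIF header for each file given.
# jhead documents -mkexif as using the file's modification time for the date.
add_basic_exif() {
    for f in "$@"; do
        "$JHEAD" -mkexif "$f"
    done
}
```

<p>If a file already has a valid EXIF header and only the date is wrong, <tt>jhead -dsft</tt> copies the file time into the existing header instead.</p>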
<p><strong>Spreading EXIF Timestamps</strong></p>
<p>This is the real reason why I wrote these scripts. I couldn&#8217;t find a satisfactory tool on the market that would allow me to automatically spread out the shooting times for a series of images.</p>
<p>Why would anyone want to do this? Because some processes give you fifty JPEG files all with the exact same creation time and exact same shooting time. I like to use file renaming tools to incorporate the shooting time in the filename, so that files sort by chronological order. This doesn&#8217;t work if all of the timestamps are the same.</p>
<p>So I wrote a little app that adjusts the first photo by one minute, the second by two minutes, and so on. If there are fifty photos, then the last one will have its shooting time adjusted by fifty minutes.</p>
<p>The result is a series of photos/scans that have different timestamps.</p>
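<p>The arithmetic is simple enough to sketch in a few lines of shell. This is an illustration of the idea rather than the droplet itself: it assumes the adjustment is made with the <tt>-ta</tt> option of <strong>jhead</strong>, which shifts the EXIF time by a relative <tt>h:mm</tt> amount, and the function names are made up.</p>

```shell
# JHEAD can be pointed at a stub (such as echo) to preview the commands.
JHEAD="${JHEAD:-jhead}"

# Turn a whole number of minutes into the +h:mm form that jhead -ta accepts.
minutes_to_offset() {
    printf '+%d:%02d\n' $(( $1 / 60 )) $(( $1 % 60 ))
}

# Shift the EXIF time of the 1st file by 1 minute, the 2nd by 2 minutes, ...
spread_timestamps() {
    n=0
    for f in "$@"; do
        n=$(( n + 1 ))
        "$JHEAD" "-ta$(minutes_to_offset "$n")" "$f"
    done
}
```

<p>Two things to keep in mind: the files are processed in the order they are passed in, so sort them first, and the shift is relative, so running it twice on the same files moves the times twice.</p>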
<p>Download: <a href="http://paperjammed.com/wp-content/uploads/2011/02/Spread-EXIF-Timestamps.zip">Spread EXIF Timestamps</a></p>
<p>Again, please look at the short program before you run it.</p>
<p><strong>Closing Thoughts</strong></p>
<p>I hope that my favorite tool implements these tricks soon (<strong>A Better Finder Attributes</strong>, I&#8217;m looking at you!), but until then, I will be dropping my files onto these three little droplets.</p>
<p>The <strong>jhead</strong> tool is so versatile that I will probably end up with a whole slew of similar droplets that will do all kinds of spiffy stuff. Nevertheless, I would rather the commercial products already provided these features. Not everyone likes dipping into AppleScript and the command line!</p>
]]></content:encoded>
			<wfw:commentRss>http://paperjammed.com/2011/02/14/a-couple-of-applescript-droplets-to-tweak-exif-timestamps/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Extra Geek Points Today: Ubuntu running on Soekris</title>
		<link>http://paperjammed.com/2011/02/06/extra-geek-points-today-ubuntu-running-on-soekris/</link>
		<comments>http://paperjammed.com/2011/02/06/extra-geek-points-today-ubuntu-running-on-soekris/#comments</comments>
		<pubDate>Mon, 07 Feb 2011 03:52:17 +0000</pubDate>
		<dc:creator>Tad</dc:creator>
				<category><![CDATA[Geekery]]></category>
		<category><![CDATA[Geeky]]></category>
		<category><![CDATA[Linux]]></category>

		<guid isPermaLink="false">http://paperjammed.com/?p=1099</guid>
		<description><![CDATA[This is all about getting another stamp on my geek card, so if that&#8217;s not your thing, you might want to just move on&#8230; Anyway, some weeks ago I was thinking how cool it would be to have a totally fanless and silent little black box that would serve up something useful in my house, [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignright size-medium wp-image-1104" title="Computer Geek Happy" src="http://paperjammed.com/wp-content/uploads/2011/02/iStock_000006522132XSmall-200x300.jpg" alt="iStockPhoto" width="200" height="300" />This is all about getting another stamp on my geek card, so if that&#8217;s not your thing, you might want to just move on&#8230;</p>
<p>Anyway, some weeks ago I was thinking how cool it would be to have a totally fanless and silent little black box that would serve up something useful in my house, such as the wiki I use to keep track of my geeky pursuits.</p>
<p>The other day I managed to obtain a spare Soekris net4801 box, a device best known for embedded applications such as firewalls. Between last night and this morning, I spent about three hours working out the hitches and getting MoinMoin and Ubuntu up and running on this neat little box.<span id="more-1099"></span></p>
<p><strong>The Device</strong></p>
<p>So this is what I am talking about:</p>
<p><img class="alignnone size-medium wp-image-1101" title="20091006-soekris-net4801" src="http://paperjammed.com/wp-content/uploads/2011/02/20091006-soekris-net4801-300x115.jpg" alt="" width="300" height="115" /></p>
<p>Cute, isn&#8217;t it?</p>
<p>It&#8217;s a <a href="http://www.soekris.com/net4801.htm">Soekris net4801</a>, a single board computer, a bit smaller than a hardcover book, with a 233MHz chip and 128MB of RAM soldered on the board. It doesn&#8217;t even have a hard drive—it runs everything off of a CF card.</p>
<p><strong>The Operating System</strong></p>
<p>After searching around for various combinations, I chose Ubuntu. Many people use FreeBSD on these boxes because it is so lightweight, but FreeBSD is not a Linux distro at all, and I immediately ran into issues during its install process. It was clear that I was going to have to learn many details of this different operating system if I wanted to make my own custom install.</p>
<p>The approach I followed involved creating an image using an existing Ubuntu machine—a Hardy instance (8.04LTS) that I have running in a VM on my Mac.</p>
<p><strong>The Process</strong></p>
<p>I followed two different pages for getting Ubuntu up and running:</p>
<ul>
<li><a href="http://wiki.soekris.info/Installing_Ubuntu_7.04_Server_via_debootstrap">Installing Ubuntu 7.04 Server via debootstrap</a></li>
<li><a href="http://www.swineworld.org/odds/ubuntu9.04-soekris4801.html">Ubuntu 9.04 (jaunty) on a soekris net-4801</a></li>
</ul>
<p>As it turned out, I had to keep bouncing between the two pages, using a little bit from one and then grabbing something from the other. For example, one page mentioned using <tt>parted</tt> for partitioning the CF card, while the other used <tt>cfdisk</tt>. I found <tt>cfdisk</tt> a little less intimidating, so that&#8217;s the one I used.</p>
<p>Both of these pages talk about using a tool called <a href="http://wiki.debian.org/Debootstrap">Debootstrap</a> to install a new Debian base system in a subdirectory of another.</p>
<p>By mounting the CF card on a running Ubuntu instance, and then mounting the ISO Ubuntu disk image, you use <tt>debootstrap</tt> to generate a base Ubuntu install on the CF card.</p>
<p>It ain&#8217;t simple, but it ain&#8217;t rocket science. There are several tedious details, but for the most part it is a matter of following the directions.</p>
<p>The overall process is as follows:</p>
<ul>
<li>Format and partition the CF card with Linux filesystems</li>
<li>Use <tt>debootstrap</tt> to create a base Ubuntu install on the card</li>
<li>Use <tt>chroot</tt> to set the card as the root filesystem and make several tweaks, such as setting up mount points, configuring networking, and adding users</li>
<li>Get the latest kernel and set up grub</li>
</ul>
<p>Then you plug it into the Soekris and boot.</p>
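<p>To give a feel for the shape of those steps, here is a small function that only <em>prints</em> a plausible command sequence. Treat every detail as an assumption: the ext3 filesystem, the <tt>hardy</tt> suite, the i386 architecture, and all three arguments are placeholders, and the kernel and grub steps are left out because they vary too much between the two guides.</p>

```shell
# Print (not run) a rough command sequence for building the card.
# All three arguments are placeholders. Check the device name with care,
# because the first command erases whatever partition it is pointed at.
print_cf_plan() {
    part="$1"      # e.g. /dev/sdX1, the partition created with cfdisk
    target="$2"    # e.g. /mnt/cf, where the card gets mounted
    mirror="$3"    # e.g. file:///mnt/iso for a mounted Ubuntu ISO
    echo "mkfs.ext3 $part"
    echo "mount $part $target"
    echo "debootstrap --arch i386 hardy $target $mirror"
    echo "chroot $target /bin/bash"
}
```

<p>Calling <tt>print_cf_plan /dev/sdX1 /mnt/cf file:///mnt/iso</tt> lists the commands so you can compare them against the guide you are following before typing anything as root.</p>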
<p><strong>Other Hardware Issues</strong></p>
<ul>
<li>You need a CF card reader, a device which is becoming more rare as each day passes.</li>
<li>You also need a serial cable and a machine with a serial port—that&#8217;s how you get to the console on the Soekris.</li>
</ul>
<p><strong>Ubuntu Configuration</strong></p>
<p>Once I got Ubuntu up and running and was connected via the console, I did the following:</p>
<ul>
<li>Installed <tt>ssh</tt></li>
<li>Installed <tt>apache2</tt></li>
<li>Installed and configured MoinMoin</li>
</ul>
<p>Once <tt>ssh</tt> was up and running, I could abandon the serial console and do my work from a terminal window on my Mac.</p>
<p><strong>Final Outcome</strong></p>
<p>I was saddened by the results. Each page takes about seven seconds to render. This is probably a combination of these factors:</p>
<ul>
<li>My box uses a CF card, and not a hard drive. It is quite possible that the claimed 15MB/s written on the card is a lie, and I haven&#8217;t even checked to see how far that is from hard drive access times.</li>
<li>MoinMoin is filesystem intensive, and they recommend running on a system with lots of memory, to keep the files in-memory.</li>
<li>This device has only 128MB of memory, and it isn&#8217;t expandable.</li>
<li>The processor isn&#8217;t that fast: 233MHz.</li>
<li>I have done nothing to tune the Ubuntu install to make it faster.</li>
</ul>
<p>I looked at top while the pages were being generated, and it was clear that the CGI script for MoinMoin was really taking seven seconds each time.</p>
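<p>If you want a number from the client side as well, <tt>curl</tt> can report the total time for a request. This is a minimal sketch; the <tt>CURL</tt> variable and the <tt>time_page</tt> name are invented here, and the URL you pass in is whatever address your own wiki answers on.</p>

```shell
# CURL can be pointed at a stub to preview the command line.
CURL="${CURL:-curl}"

# Fetch a URL, discard the body, and print only the total time in seconds.
time_page() {
    "$CURL" -s -o /dev/null -w '%{time_total}\n' "$1"
}
```

<p>Running it a few times in a row against the same page shows whether the delay is consistent or only hits the first request.</p>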
<p>I looked up MoinMoin tuning and saw that I should be using WSGI instead of CGI, but there wasn&#8217;t a ready made package for Hardy, and I didn&#8217;t want to spend the time building it, chasing after a possible second or two of performance.</p>
<p>Maybe some day I&#8217;ll buy the hard drive adapter for the Soekris and see if I can bump up the speed that way. It&#8217;s better to run a system like this on a hard drive anyway.</p>
<p>Another thing I have in mind is to try this on my PC Engines ALIX box that runs my home firewall—that&#8217;s a much faster board with more memory.</p>
<p>So, if you landed here because you were trying something similar, let me know. Share your thoughts and successes or failures.</p>
<p><strong>Update</strong></p>
<p>So I had some spare time today and I decided to take another whack at loading some of the other variants of Unix/Linux onto the device, hopefully finding something lightweight with a smaller memory footprint.</p>
<p>The mysteries of <a href="http://en.wikipedia.org/wiki/Preboot_Execution_Environment"><strong>PXE</strong></a> were slowly made clear to me. If you read the Wikipedia page, it looks fairly complex, with lots of configuration on the various bits. The reality is that PXE involves a few simple things:</p>
<ul>
<li>Local DHCP server tweaks to point the way to the PXE server</li>
<li>A <strong><a href="http://en.wikipedia.org/wiki/Trivial_File_Transfer_Protocol">tftp</a></strong> server—meaning Trivial File Transfer Protocol</li>
<li>The actual PXE boot payload file (typically a bootstrap file along with a config file)</li>
<li>Some kind of file server, such as HTTP or NFS, to host the full install directory tree from the ISO of choice (FreeBSD, Ubuntu, etc)</li>
</ul>
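<p>The DHCP part boils down to two pieces of information: which machine runs the TFTP server, and which file to fetch from it. As an illustration only, the function below prints those two settings in the syntax used by the ISC DHCP server (<tt>next-server</tt> and <tt>filename</tt>). The address and file name are placeholders, and a firewall appliance such as m0n0wall is configured through its own config file rather than with these exact lines.</p>

```shell
# Print the two ISC dhcpd-style settings that send PXE clients to a TFTP server.
# Both values are placeholders for illustration.
pxe_dhcp_lines() {
    tftp_server="$1"   # address of the machine running tftpd
    boot_file="$2"     # bootstrap file name, relative to the TFTP root
    printf 'next-server %s;\n' "$tftp_server"
    printf 'filename "%s";\n' "$boot_file"
}
```

<p>For example, <tt>pxe_dhcp_lines 192.168.1.10 pxelinux.0</tt> prints the pair of lines you would adapt for your own DHCP server.</p>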
<p>Any decent geek who messes around with Linux boxes probably already has a few of these pieces running somewhere.</p>
<p>The reason it seems so darned complicated is that most online solutions do all of this in one shot on one machine, building a DHCP server as well as an NFS server, talking about this mysterious <tt>tftp</tt> along with <tt>inetd</tt> (&#8220;Trivial FTP&#8221; sounds so much more manageable, doesn&#8217;t it?).</p>
<p>Upon realizing that most of the scary stuff was all about getting <tt>tftp</tt> running, I followed this fine set of instructions on <a href="http://www.davidsudjiman.info/2006/03/27/installing-and-setting-tftpd-in-ubuntu/">Installing and setting TFTPD in Ubuntu</a> and had my Ubuntu <tt>tftp</tt> server running in about five minutes.</p>
<p>I was also able to fairly quickly add two entries to my m0n0wall firewall configuration (my DHCP server) so that it would redirect local PXE requests correctly. This involved two &#8220;hidden options&#8221; from this m0n0wall page: <a href="http://doc.m0n0.ch/handbook/faq-hiddenopts.html">What about hidden config.xml options?</a></p>
<p>And the web server? I already had an Apache server installed on my Ubuntu box, so that was no problem whatsoever.</p>
<p>Once you separate it out into its pieces, it is a pretty quick setup.</p>
<p>The hardest part is setting up the PXE payload. I was able to get Soekris to boot a couple of different boot kernels, but at this point I realized that any more progress would require custom boot loaders and such, compiled on a FreeBSD machine, and this was more than I wanted to get into.</p>
<p>So I reimaged the borrowed Soekris as a m0n0wall firewall, as it had been before, and restored the original configuration file. In a few days it will be back in its home, doing a proper job of being a firewall.</p>
]]></content:encoded>
			<wfw:commentRss>http://paperjammed.com/2011/02/06/extra-geek-points-today-ubuntu-running-on-soekris/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>New life for an old PC—no geek card required</title>
		<link>http://paperjammed.com/2010/05/05/new-life-for-an-old-pc%e2%80%94no-geek-card-required/</link>
		<comments>http://paperjammed.com/2010/05/05/new-life-for-an-old-pc%e2%80%94no-geek-card-required/#comments</comments>
		<pubDate>Thu, 06 May 2010 01:52:22 +0000</pubDate>
		<dc:creator>Tad</dc:creator>
				<category><![CDATA[Paperless Life]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Backups]]></category>
		<category><![CDATA[Data Loss]]></category>
		<category><![CDATA[Geeky]]></category>
		<category><![CDATA[Good Sites]]></category>
		<category><![CDATA[Hardware]]></category>
		<category><![CDATA[Knowledge Management]]></category>
		<category><![CDATA[Networking]]></category>
		<category><![CDATA[Reviews]]></category>
		<category><![CDATA[Tips]]></category>

		<guid isPermaLink="false">http://paperjammed.com/?p=985</guid>
		<description><![CDATA[Do you still have an old machine kicking around in the basement or the back room, long forgotten? For no cost and almost zero effort, you can set it up as a dedicated network appliance, using one of the many turnkey products from the open-source TurnKey Linux project. I&#8217;m serious. You don&#8217;t need to know [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignright size-medium wp-image-986" src="http://paperjammed.com/wp-content/uploads/2010/05/iStock_000004973496XSmall-200x300.jpg" alt="istockphoto.com" width="200" height="300" />Do you still have an old machine kicking around in the basement or the back room, long forgotten?<br />
For no cost and almost zero effort, you can set it up as a dedicated network appliance, using one of the many turnkey products from the open-source TurnKey Linux project.</p>
<p>I&#8217;m serious. You don&#8217;t need to know anything at all about Linux to use one of these. Just download the image, install, and you suddenly have a full featured NAS file server, or you might have a database or a source code repository.</p>
<p>Last year I wrote an article on <a href="http://paperjammed.com/2009/02/15/new-life-for-an-old-clunker/">how to set up a NAS device using Ubuntu Linux</a>. I have been a fan of Ubuntu since the start because it is a very easy distribution to install and configure. The down-side of using Linux has always been the fairly steep learning curve. Before you can get around to using the server, you need to get down in the weeds with configuration files and other stuff.</p>
<p>TurnKey Linux changes all of that.<span id="more-985"></span></p>
<p><strong>Painless Installation</strong></p>
<p>A few weeks back, I was setting up an aging PC as a standalone wiki server for a small office—this machine was going to provide a place for the office staff to document their procedures, how-tos, and other things.</p>
<p>I was about to set up an Ubuntu server, as I have done before many times, and install MoinMoin, like I did <a href="http://paperjammed.com/2009/10/12/why-not-try-a-personal-wiki-for-some-of-your-more-amorphous-notes/">some months back</a>. I remembered that it was a bit of a pain to get everything tweaked just right, so I did a quick check to see what kind of standalone wiki options were available online.</p>
<p>This is how I found TurnKey Linux. This project is all about single-purpose preconfigured Ubuntu server images.</p>
<p>One of those preconfigured images happens to be a <a href="http://www.turnkeylinux.org/mediawiki">MediaWiki appliance</a>—the wiki engine behind Wikipedia—and I was in business.</p>
<p>The installation took about fifteen minutes, with very little user interaction. I answered a few basic questions and the installer took over from there. As soon as the install was done, the machine rebooted and displayed a message on the monitor listing the IP addresses you can browse to from any other machine.</p>
<p><strong>Full Featured</strong></p>
<p>The work that has gone into these appliances is amazing. In fifteen minutes I had installed a complex configuration that includes Apache, PHP, MySQL, and the MediaWiki core, as well as maintenance utilities such as a neat tool that provides a <span style="text-decoration: line-through;">Flash-based</span> pure-AJAX-based SSH command line in a remote browser (i.e. your browser becomes a terminal). Even someone with Linux experience would have to spend quite a bit of time fiddling around with different packages and configuration options in order to provide the same functionality that TurnKey gives you out of the box.</p>
<p>As with most open source projects, the documentation is about 80% complete, with deep detail in some areas, but leaving others fairly sparsely documented. But don&#8217;t let this deter you: in most cases users know how to use the product they are installing (e.g. MediaWiki) but don&#8217;t want the hassle of configuring it on Linux. That&#8217;s where TurnKey shines.</p>
<p><strong>Some Examples</strong></p>
<p>In minutes, you can set up a <a href="http://www.turnkeylinux.org/fileserver">NAS device</a>. If you want to try advanced content management in your office, try <a href="http://www.turnkeylinux.org/joomla">Joomla</a> or <a href="http://www.turnkeylinux.org/drupal6">Drupal</a>.</p>
<p>If you are working on a small project team and want to protect your source code, try <a href="http://www.turnkeylinux.org/redmine">Redmine</a> or <a href="http://www.turnkeylinux.org/trac">Trac</a> and do your bug tracking using <a href="http://www.turnkeylinux.org/bugzilla">Bugzilla</a>.</p>
<p>And while you are at it, you can document your organization&#8217;s working practices using a wiki such as <a href="http://www.turnkeylinux.org/moinmoin">MoinMoin</a> or <a href="http://www.turnkeylinux.org/mediawiki">MediaWiki</a>.</p>
<p><strong>Don&#8217;t forget to back it up!</strong></p>
<p>As with any computer, you should include your new TurnKey appliance in your backup strategy. The nice thing is that you don&#8217;t really need to care at all about backing up Linux or the other software; just back up the data. I don&#8217;t need to back up my entire MediaWiki machine; I just need to back up the database and image files. If anything goes wrong, you can rebuild the TurnKey appliance from scratch in minutes and then restore your data.</p>
<p>To save yourself some pain, keep notes on any small tweaks you made to the configuration.</p>
<p><strong>One Machine, One Purpose</strong></p>
<p>These disk images share common Ubuntu underpinnings, but they are referred to as Appliances because they turn your PC into a purpose-built appliance.</p>
<p>This means that if you want a content management system and you also want a ticket management system, you will need two old computers—not a rare commodity these days.</p>
<p>Take a look at <a href="http://www.turnkeylinux.org/">what they have to offer</a> and give TurnKey a shot—specialized software used in corporate environments is now within reach of small offices at the right price.</p>
]]></content:encoded>
			<wfw:commentRss>http://paperjammed.com/2010/05/05/new-life-for-an-old-pc%e2%80%94no-geek-card-required/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>A handful of sweet freebie tools to save the day</title>
		<link>http://paperjammed.com/2010/03/16/a-handful-of-sweet-freebie-tools-to-save-the-day/</link>
		<comments>http://paperjammed.com/2010/03/16/a-handful-of-sweet-freebie-tools-to-save-the-day/#comments</comments>
		<pubDate>Wed, 17 Mar 2010 03:31:14 +0000</pubDate>
		<dc:creator>Tad</dc:creator>
				<category><![CDATA[Searching and Indexing]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Workflow]]></category>
		<category><![CDATA[Geeky]]></category>
		<category><![CDATA[Macros]]></category>
		<category><![CDATA[PDF]]></category>
		<category><![CDATA[Scripting]]></category>
		<category><![CDATA[Tips]]></category>
		<category><![CDATA[Windows]]></category>

		<guid isPermaLink="false">http://paperjammed.com/?p=930</guid>
		<description><![CDATA[It so happens that my employer has made a most welcome decision to replace the aging creaky old Novell GroupWise mail software with Microsoft Outlook, joining the rest of the modern corporate world. Now, there is little love in my heart for GroupWise, but it does have one feature that the new Outlook configuration will [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignright size-medium wp-image-935" title="iStock_000000846660XSmall" src="http://paperjammed.com/wp-content/uploads/2010/03/iStock_000000846660XSmall-300x199.jpg" alt="" width="300" height="199" />It so happens that my employer has made a most welcome decision to replace the aging creaky old Novell GroupWise mail software with Microsoft Outlook, joining the rest of the modern corporate world. Now, there is little love in my heart for GroupWise, but it does have one feature that the new Outlook configuration will lack: you can keep as many emails as you want, just like Gmail.</p>
<p>The problem is this: with Outlook we will be limited to 1000 messages in our in-box; sadly, many of us have tens of thousands of emails in our old GroupWise mail. Even after a fairly rigorous slash and burn mission, hacking out all of the low hanging fruit, there will be many thousands remaining and I don&#8217;t want to lose that information. It might be useful to search and find how I set up a Zebra bar code printer in 2003, no?</p>
<p>A bundle of different freeware glue tools came to my rescue. Read on to hear about the toolset that has made it so I can keep those messages for years to come.<span id="more-930"></span></p>
<p><strong>Possible Solutions</strong></p>
<p>Right out of the gate, I began looking for ways to migrate messages from one mail client to the other. Some apps have this built right in, and if not, there are scripts and utilities out there to do this; but I was hampered by a few key facts:</p>
<ul>
<li>I have no control over the email clients and their configuration. Even if there is a menu option for exporting GroupWise messages from version 7.2, I&#8217;m stuck at 6.4 and cannot use that option.</li>
<li>GroupWise is a minor player in the email world. I&#8217;m not sure if Outlook would import from GroupWise, but I doubt it.</li>
<li>They are <em>replacing</em> the client in one shot. There will be no interim period where both GroupWise and Outlook will be available.</li>
<li>There is no getting around the hard limit of 1000 messages.</li>
<li>I don&#8217;t want to spend money on this.</li>
</ul>
<p>With these constraints in mind, I immediately thought about PDF documents. I then considered the following questions:</p>
<ul>
<li>How do I convert my email to PDF?</li>
<li>How can I do this automatically with thousands of emails?</li>
<li>Once I&#8217;m done, how do I search these documents?</li>
</ul>
<p>Here&#8217;s what I did:</p>
<p><strong>Conversion to PDF</strong></p>
<p>The first part was easy. I downloaded one of the many free print-to-PDF products available.</p>
<p>I chose <a href="http://sourceforge.net/projects/pdfcreator/">PDFCreator</a>, because I am familiar with its use and I know that it <a href="http://paperjammed.com/2009/10/27/dodged-the-corrupt-document-bullet-this-time-just-barely/">does not munge the fonts</a>.</p>
<p>Like many other PDF generation utilities, PDFCreator functions by providing a virtual printer to which any application can print. For example, to make a PDF of a web page, you use the Firefox <strong>Print</strong> menu and select <strong>PDFCreator</strong> from the drop-down list of available printers.</p>
<p>You are provided with a list of metadata fields that you can fill in, and these fields are used in the PDF generation.</p>
<p>Here&#8217;s what the PDFCreator screen looks like:</p>
<p><img class="alignnone size-full wp-image-931" title="20100316-pdfcreator1" src="http://paperjammed.com/wp-content/uploads/2010/03/20100316-pdfcreator1.gif" alt="" width="500" height="367" /></p>
<p><strong>A word of caution:</strong> PDFCreator is free, but you must be careful to deselect their spammy toolbar options in two different places during the installation process. I don&#8217;t like software that comes with preselected toolbars to install (even nice ones like Google&#8217;s) because I&#8217;m certain that 95% of the folks who actually install the toolbar would never have chosen to do so if it were unchecked by default.</p>
<p><strong>Running Everything Automatically</strong></p>
<p>This was the interesting bit. I use Windows machines at work, so there was no AppleScript option available. So I did the next best thing: I used <a href="http://www.autoitscript.com/autoit3/index.shtml">AutoIT</a>.</p>
<p>I will warn you that AutoIT is pretty much the Windows analog of AppleScript, without the cutesy pseudo-English syntax. In other words, you will need to roll up your sleeves and get your hands a little dirty in order to put together a decent AutoIT script.</p>
<p>The payoff comes when you finish your work and compile it into a tight executable that you can share with your friends, allowing them to automate some complex series of button clicks and copy/paste operations.</p>
<p>I walked through the manual process of exporting an email to PDF and listed each action:</p>
<ul>
<li>Get the date, sender, and subject</li>
<li>Create a filename based on date + sender + subject</li>
<li>Launch the <strong>Print</strong> dialog</li>
<li>Select <strong>PDFCreator</strong></li>
<li>Fill in the <strong>Document Title</strong>, <strong>Creation Date</strong>, and <strong>Subject</strong> in the PDFCreator dialog</li>
<li>Fill in the full file path in the Save dialog</li>
</ul>
<p>In addition, I wanted to make the script a little better by adding the following:</p>
<ul>
<li>Check that the user has PDFCreator installed</li>
<li>Verify that GroupWise is running and that the user has selected one or more messages</li>
<li>Prompt the user for a target directory before processing the messages</li>
<li>Sanitize the filenames by replacing illegal characters with underscores and truncating them to fit the Windows limits on filename and path length</li>
<li>Quickly skip over files that have already been generated, so that one doesn&#8217;t need to worry about accidentally selecting messages that were already printed</li>
</ul>
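<p>The filename and skip steps are where most of the fiddly edge cases live, so here is a minimal sketch of that logic. It is written in Python purely for illustration and is not the actual AutoIT script; the names <code>build_pdf_name</code> and <code>needs_printing</code>, the date format, and the 120-character cap are all assumptions.</p>

```python
import re
from datetime import datetime
from pathlib import Path

# Characters Windows rejects in filenames, plus control characters.
# \x3c and \x3e are the two angle brackets, written as escapes.
ILLEGAL_CHARS = re.compile(r'[\x00-\x1f\x3c\x3e:"/\\|?*]')
MAX_NAME_LENGTH = 120  # assumed cap, chosen to leave room under the path limit


def build_pdf_name(sent, sender, subject):
    """Build a sanitized 'date sender subject.pdf' filename (hypothetical helper)."""
    stem = f"{sent:%Y-%m-%d %H%M} {sender} {subject}"
    # Replace every illegal character with an underscore.
    stem = ILLEGAL_CHARS.sub("_", stem).strip(" .")
    # Truncate, then drop any trailing space or dot, which Windows dislikes.
    return stem[:MAX_NAME_LENGTH].rstrip(" .") + ".pdf"


def needs_printing(target_dir, name):
    """Return True when no PDF with this name exists yet (hypothetical helper)."""
    return not (Path(target_dir) / name).exists()
```

<p>Checking <code>needs_printing</code> before opening the Print dialog is what makes a rerun over already-printed messages cheap: the slow GUI steps are skipped entirely for any message whose PDF is already in the target directory.</p>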
<p>There were other adjustments needed, but the process was the same: run the script, hit a problem, tweak the script a little to address the problem, and repeat.</p>
<p>Here&#8217;s a little bit of the AutoIT script:</p>
<p><img class="size-full wp-image-943 alignnone" title="20100316-autoit" src="http://paperjammed.com/wp-content/uploads/2010/03/20100316-autoit.gif" alt="" width="500" height="345" /></p>
<p>You can see that it is a bit more intense than AppleScript, but remember that the full script wasn&#8217;t written in one go. I started with a short ten-line script and kept tweaking it as small problems cropped up until I had adjusted things to my liking.</p>
<p>Note that this is a GUI macro language. The machine starts clicking and typing away right in front of you and you probably shouldn&#8217;t interfere until your script finishes.</p>
<p>As of this afternoon, I have generated around 4,000 PDF documents for my email messages.</p>
<p><strong>Searching All of Those Documents</strong></p>
<p>This was the easiest part. These days there is an excellent tool available for searching documents on your desktop: <a href="http://desktop.google.com/">Google Desktop</a>. This product indexes every useful file on your desktop and provides a full Google search with a quick double-tap of the &lt;control&gt; key.</p>
<p>So you can enter a search like &#8220;Zebra bar code&#8221;</p>
<p><img class="alignnone size-full wp-image-944" title="20100316-google1" src="http://paperjammed.com/wp-content/uploads/2010/03/20100316-google1.gif" alt="" width="300" height="205" /></p>
<p>The results look exactly like a Google web search, except that they show your desktop files. You can see inline previews too.</p>
<p><img class="alignnone size-full wp-image-945" title="20100316-google2" src="http://paperjammed.com/wp-content/uploads/2010/03/20100316-google2.gif" alt="" width="500" height="443" /></p>
<p>Macintosh users can install Google Desktop as well, but all of these files should already be indexed and searchable by Spotlight.</p>
<p><strong>Closing Thoughts</strong></p>
<p>Whenever I reach for tools like this I feel a twinge of guilt—it&#8217;s outright hackery, isn&#8217;t it?</p>
<p>But there is a place for quick and dirty jobs in every workplace. I needed to get my files from one place to another, one time only. It just didn&#8217;t make sense to spend money or time on a more elegant solution.</p>
<p>Play around with each of these tools a little. Especially AutoIT—it&#8217;s a handy Swiss Army Knife to have at your disposal.</p>
]]></content:encoded>
			<wfw:commentRss>http://paperjammed.com/2010/03/16/a-handful-of-sweet-freebie-tools-to-save-the-day/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Automate ScanSnap OCR process on your Mac with AppleScript (Snow Leopard Edition)</title>
		<link>http://paperjammed.com/2010/01/04/automate-scansnap-ocr-process-on-your-mac-with-applescript-snow-leopard-edition/</link>
		<comments>http://paperjammed.com/2010/01/04/automate-scansnap-ocr-process-on-your-mac-with-applescript-snow-leopard-edition/#comments</comments>
		<pubDate>Tue, 05 Jan 2010 01:51:52 +0000</pubDate>
		<dc:creator>Tad</dc:creator>
				<category><![CDATA[Software]]></category>
		<category><![CDATA[Workflow]]></category>
		<category><![CDATA[Geeky]]></category>
		<category><![CDATA[PDF]]></category>
		<category><![CDATA[Scanning]]></category>
		<category><![CDATA[Scripting]]></category>
		<category><![CDATA[Searching and Indexing]]></category>

		<guid isPermaLink="false">http://paperjammed.com/?p=840</guid>
		<description><![CDATA[Some time back I published an AppleScript that allows one to automatically run OCR in the background on scanned files generated by your Fujitsu ScanSnap, while you continue scanning more files. ScanSnap owners should all be familiar with this: the out-of-the-box configuration of the ScanSnap Manager and Abbyy Finereader forces the scan and OCR [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://paperjammed.com/wp-content/uploads/2009/08/20090829-applescript.gif"><img class="alignright size-full wp-image-658" title="20090829-applescript" src="http://paperjammed.com/wp-content/uploads/2009/08/20090829-applescript.gif" alt="" width="128" height="128" /></a>Some time back I published an AppleScript that lets you <a href="http://paperjammed.com/2009/08/29/automate-scansnap-ocr-process-on-your-mac-with-applescript/">automatically run OCR in the background on scanned files</a> generated by your Fujitsu ScanSnap, while you continue scanning more files. ScanSnap owners should all be familiar with the problem: the out-of-the-box configuration of the ScanSnap Manager and Abbyy Finereader forces the scan and OCR stages to run in lockstep (scan 1&#8230;OCR 1&#8230;scan 2&#8230;OCR 2&#8230; and so on). This script allowed you to scan regardless of the OCR processing going on.</p>
<p>As it turns out, my original script does not work in Snow Leopard, and I promised that I would one day clean up and publish my new and improved version.</p>
<p>Chris posted a comment today as a gentle reminder, so here it is without further delay&#8230;<br />
<span id="more-840"></span><br />
<strong>The Details</strong></p>
<p>Unfortunately, Snow Leopard came around <a href="http://paperjammed.com/2009/09/07/when-migrating-to-a-new-operating-system-look-before-you-leap/">and caused some indigestion</a>. For starters, the ScanSnap Manager didn&#8217;t work correctly and Abbyy Finereader would not process anything made by the ScanSnap. A couple of months later <a href="http://paperjammed.com/2009/11/13/snow-leopard-update-for-scansnap/">they got everything straightened out</a> and delivered <a href="http://www.fujitsu.com/us/services/computing/peripherals/scanners/support/sl_download.html">new versions of each product</a>.</p>
<p>The new version of the Abbyy Finereader product does not play well with my original script.</p>
<p>Since I cannot do without this important functionality, I rolled up my sleeves and rewrote most of the script. The new version works quite nicely in Snow Leopard, with one small annoyance: the new Finereader version keeps bouncing its darned icon the whole time it is running. That is annoying enough to watch that you really don&#8217;t want to use the machine for anything other than scanning or OCR while a job is going.</p>
<p>Fortunately, I really don&#8217;t need to use my machine for anything else while it is chewing on the docs; I just wanted to be able to continue scanning at the same time!</p>
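<p>The idea behind the script is a simple queue guarded by a lock file. Here is a rough sketch of that pattern in Python, for illustration only; it is not a translation of the AppleScript. The function names and the <code>run_ocr</code> callback are hypothetical, and recognizing finished files purely by the <code> processed by FineReader.pdf</code> suffix is a simplifying assumption.</p>

```python
import os
from pathlib import Path

OCR_SUFFIX = " processed by FineReader.pdf"  # suffix FineReader adds to its output


def next_unprocessed(folder):
    """Return the first scan that has no OCR output yet, or None."""
    for pdf in sorted(Path(folder).glob("*.pdf")):
        if pdf.name.endswith(OCR_SUFFIX):
            continue  # this is an OCR output, not a scan
        if not pdf.with_name(pdf.stem + OCR_SUFFIX).exists():
            return pdf
    return None


def drain_queue(folder, lock_path, run_ocr):
    """Process scans one at a time unless another run holds the lock.

    Returns the number of files handed to run_ocr (a hypothetical callback).
    """
    try:
        # O_EXCL makes creation fail if the lock file already exists.
        os.close(os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY))
    except FileExistsError:
        return 0  # another run is already working through the queue
    done = 0
    try:
        pdf = next_unprocessed(folder)
        while pdf is not None:
            run_ocr(pdf)
            done += 1
            # Re-scan the folder so files added meanwhile are picked up.
            pdf = next_unprocessed(folder)
    finally:
        os.remove(lock_path)  # always release the lock
    return done
```

<p>Because the folder is listed again after every file, scans that arrive while OCR is running simply join the queue. Note that if <code>run_ocr</code> never produces its output file, this loop would retry the same scan forever, so a real version needs a timeout.</p>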
<p><strong>Note: </strong>Before going forward, you will need to upgrade the ScanSnap Manager and Abbyy Finereader to the Snow Leopard versions! Get the files <a href="http://www.fujitsu.com/us/services/computing/peripherals/scanners/support/sl_download.html">here</a>.</p>
<p>Here is a link to the <a href="http://paperjammed.com/wp-content/uploads/2010/01/Run-OCR-on-New-Folder-Items.scpt">new script</a>&#8230;</p>
<p>And here&#8217;s the code itself:</p>
<div class="codecolorer-container applescript default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:435px;height:300px;"><table cellspacing="0" cellpadding="0"><tbody><tr><td><div class="applescript codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #808080; font-style: italic;">(*<br />
<br />
NOTE: This script was written for Snow Leopard. It may work<br />
on Leopard, but I never tried it.<br />
<br />
This is a folder listener script that will act as a queue, receiving<br />
PDF files from the ScanSnap scanner and feeding them, one by one, to<br />
the Abbyy FineReader OCR software.<br />
<br />
This allows you to keep scanning while the OCR job runs in the background<br />
on all of the unprocessed files.<br />
<br />
Why do we want to do this?<br />
<br />
The ScanSnap Manager software does not support this by default, so<br />
when you scan in a file, it sends it to FineReader for OCR. You then<br />
must wait until FineReader finishes its work before scanning in another<br />
document.<br />
<br />
This script allows you to keep scanning without waiting for OCR.<br />
<br />
Installation:<br />
<br />
o &nbsp; Copy this script to:<br />
<br />
&nbsp; &nbsp; &lt;home&gt;/Library/Scripts/Folder Action Scripts<br />
<br />
&nbsp; &nbsp; You may have to create the &quot;Folder Action Scripts&quot; folder.<br />
<br />
o &nbsp; Open a Finder window and navigate to the parent folder<br />
&nbsp; of the scanned documents folder.<br />
<br />
o Right click (control-click) the scanned documents folder and<br />
&nbsp; choose:<br />
<br />
&nbsp; &nbsp; Folder Actions Setup...<br />
<br />
o At this point if folder actions are not enabled, you will<br />
&nbsp; likely have to enable them and add the script manually.<br />
&nbsp; &nbsp; - check &quot;Enable Folder Actions&quot;<br />
&nbsp; &nbsp; - Use the &quot;+&quot; buttons on the left and right sides to add the<br />
&nbsp; &nbsp; &nbsp; scan folder and then this script.<br />
&nbsp; &nbsp; <br />
o Otherwise, a list of scripts will come up. Choose this script<br />
&nbsp; from the &quot;Choose a Script to Attach&quot; dialog.<br />
<br />
o Close all windows.<br />
<br />
Copyright (C) 2010 Tad Harrison<br />
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">property</span> ocrFileSuffix : <span style="color: #009900;">&quot; processed by FineReader.pdf&quot;</span><br />
<span style="color: #ff0033; font-weight: bold;">property</span> ocrApplicationName : <span style="color: #009900;">&quot;Scan to Searchable PDF&quot;</span><br />
<span style="color: #ff0033; font-weight: bold;">property</span> ocrApplicationWindow : <span style="color: #009900;">&quot;Converting the document&quot;</span><br />
<span style="color: #ff0033; font-weight: bold;">property</span> ocrLockFileName : <span style="color: #009900;">&quot;OCR in Progress&quot;</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> <span style="color: #0066ff;">adding</span> <span style="color: #0066ff;">folder</span> <span style="color: #0066ff;">items</span> <span style="color: #ff0033; font-weight: bold;">to</span> this_folder <span style="color: #ff0033;">after</span> <span style="color: #0066ff;">receiving</span> added_items<br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> lockFilePath <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #000000;">&#40;</span><span style="color: #0066ff;">POSIX path</span> <span style="color: #ff0033; font-weight: bold;">of</span> <span style="color: #000000;">&#40;</span><span style="color: #0066ff;">path to</span> <span style="color: #0066ff;">desktop</span> <span style="color: #0066ff;">folder</span> <span style="color: #ff0033;">as</span> <span style="color: #0066ff;">text</span><span style="color: #000000;">&#41;</span><span style="color: #000000;">&#41;</span> <span style="color: #000000;">&amp;</span> ocrLockFileName<br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">try</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; logEvent<span style="color: #000000;">&#40;</span><span style="color: #009900;">&quot;=== Run OCR on New Folder Items ===&quot;</span><span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">-- Test for lockfile; exit if lockfile exists</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">tell</span> <span style="color: #0066ff;">application</span> <span style="color: #009900;">&quot;System Events&quot;</span> <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #ff0033; font-weight: bold;">set</span> lockFileExists <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">exists</span> <span style="color: #0066ff;">file</span> lockFilePath<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">if</span> lockFileExists <span style="color: #ff0033; font-weight: bold;">then</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; logEvent<span style="color: #000000;">&#40;</span><span style="color: #009900;">&quot;Other script running. Exiting...&quot;</span><span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">return</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">else</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #0066ff;">do shell script</span> <span style="color: #009900;">&quot;/usr/bin/touch <span style="color: #000000; font-weight: bold;">\&quot;</span>&quot;</span> <span style="color: #000000;">&amp;</span> lockFilePath <span style="color: #000000;">&amp;</span> <span style="color: #009900;">&quot;<span style="color: #000000; font-weight: bold;">\&quot;</span>&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">if</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">-- Main loop</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> moreWorkToDo <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">true</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">repeat</span> <span style="color: #ff0033; font-weight: bold;">while</span> moreWorkToDo<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> aFile <span style="color: #ff0033; font-weight: bold;">to</span> getNextFile<span style="color: #000000;">&#40;</span>this_folder<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">if</span> <span style="color: #ff0033;">not</span> aFile <span style="color: #000000;">=</span> <span style="color: #009900;">&quot;&quot;</span> <span style="color: #ff0033; font-weight: bold;">then</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; ocrFile<span style="color: #000000;">&#40;</span>aFile<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">else</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> moreWorkToDo <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">false</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">if</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">repeat</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; logEvent<span style="color: #000000;">&#40;</span><span style="color: #009900;">&quot;No more work.&quot;</span><span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; exitApp<span style="color: #000000;">&#40;</span>ocrApplicationName<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">on</span> <span style="color: #ff0033; font-weight: bold;">error</span> errorStr <span style="color: #0066ff;">number</span> errNum<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #0066ff;">display dialog</span> <span style="color: #009900;">&quot;Error &quot;</span> <span style="color: #000000;">&amp;</span> errNum <span style="color: #000000;">&amp;</span> <span style="color: #009900;">&quot; while running OCR: &quot;</span> <span style="color: #000000;">&amp;</span> errorStr<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> <span style="color: #ff0033; font-weight: bold;">my</span> isRunning <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">false</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">try</span><br />
&nbsp; &nbsp; <span style="color: #808080; font-style: italic;">-- Get rid of the lockfile, ignoring any errors</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">try</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #0066ff;">do shell script</span> <span style="color: #009900;">&quot;/bin/rm <span style="color: #000000; font-weight: bold;">\&quot;</span>&quot;</span> <span style="color: #000000;">&amp;</span> lockFilePath <span style="color: #000000;">&amp;</span> <span style="color: #009900;">&quot;<span style="color: #000000; font-weight: bold;">\&quot;</span>&quot;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">try</span><br />
<span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #0066ff;">adding</span> <span style="color: #0066ff;">folder</span> <span style="color: #0066ff;">items</span> <span style="color: #ff0033; font-weight: bold;">to</span><br />
<span style="color: #808080; font-style: italic;">(*<br />
Name: ocrFile<br />
Description: Runs OCR on the given file and waits for the output file to appear<br />
Parameters:<br />
&nbsp; aFile - the file to be OCR'd<br />
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> ocrFile<span style="color: #000000;">&#40;</span>aFile<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> posixFilePath <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">POSIX path</span> <span style="color: #ff0033; font-weight: bold;">of</span> aFile<br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> posixOcrFilePath <span style="color: #ff0033; font-weight: bold;">to</span> getPosixOcrFilePath<span style="color: #000000;">&#40;</span>posixFilePath<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; logEvent<span style="color: #000000;">&#40;</span><span style="color: #009900;">&quot;OCR: &quot;</span> <span style="color: #000000;">&amp;</span> posixFilePath<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">tell</span> <span style="color: #0066ff;">application</span> ocrApplicationName <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">open</span> aFile<br />
&nbsp; &nbsp; <span style="color: #808080; font-style: italic;">--</span><br />
&nbsp; &nbsp; <span style="color: #808080; font-style: italic;">-- Now sit in a loop checking once per second for the OCR file.</span><br />
&nbsp; &nbsp; <span style="color: #808080; font-style: italic;">-- Give up after five minutes. &quot;with timeout&quot; only limits Apple</span><br />
&nbsp; &nbsp; <span style="color: #808080; font-style: italic;">-- events, so the loop counts elapsed seconds itself.</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> secondsWaited <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #000000;">0</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> ocrFileExists <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">false</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">repeat</span> <span style="color: #ff0033; font-weight: bold;">until</span> ocrFileExists<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> ocrFileExists <span style="color: #ff0033; font-weight: bold;">to</span> posixFileExists<span style="color: #000000;">&#40;</span>posixOcrFilePath<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">if</span> ocrFileExists <span style="color: #ff0033; font-weight: bold;">then</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; logEvent<span style="color: #000000;">&#40;</span><span style="color: #009900;">&quot;OCR file generated.&quot;</span><span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">-- Wait 5 even if the file was found, to let things settle</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; delay <span style="color: #000000;">5</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">else</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">if</span> secondsWaited <span style="color: #000000;">&gt;=</span> <span style="color: #000000;">300</span> <span style="color: #ff0033; font-weight: bold;">then</span> <span style="color: #ff0033; font-weight: bold;">error</span> <span style="color: #009900;">&quot;Timed out waiting for &quot;</span> <span style="color: #000000;">&amp;</span> posixOcrFilePath<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; delay <span style="color: #000000;">1</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> secondsWaited <span style="color: #ff0033; font-weight: bold;">to</span> secondsWaited <span style="color: #000000;">+</span> <span style="color: #000000;">1</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">if</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">repeat</span><br />
<span style="color: #ff0033; font-weight: bold;">end</span> ocrFile<br />
<span style="color: #808080; font-style: italic;">(*<br />
Name: appIsRunning<br />
Description: Determines if a particular application is running.<br />
Parameters:<br />
&nbsp; &nbsp; appName - the name of the application to be tested<br />
Returns: True if the application is running; otherwise False<br />
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> appIsRunning<span style="color: #000000;">&#40;</span>appName<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">tell</span> <span style="color: #0066ff;">application</span> <span style="color: #009900;">&quot;System Events&quot;</span> <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #000000;">&#40;</span><span style="color: #0066ff;">name</span> <span style="color: #ff0033; font-weight: bold;">of</span> processes<span style="color: #000000;">&#41;</span> <span style="color: #ff0033;">contains</span> appName<br />
<span style="color: #ff0033; font-weight: bold;">end</span> appIsRunning<br />
<span style="color: #808080; font-style: italic;">(*<br />
Name: posixFileExists<br />
Description: Determines if a particular file exists.<br />
Parameters:<br />
&nbsp; &nbsp; posixFilePath - the POSIX path to the file<br />
Returns: True if the file exists; otherwise False<br />
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> posixFileExists<span style="color: #000000;">&#40;</span>posixFilePath<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">tell</span> <span style="color: #0066ff;">application</span> <span style="color: #009900;">&quot;System Events&quot;</span> <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">exists</span> <span style="color: #0066ff;">file</span> posixFilePath<br />
<span style="color: #ff0033; font-weight: bold;">end</span> posixFileExists<br />
<span style="color: #808080; font-style: italic;">(*<br />
Name: exitApp<br />
Description: Exits the specified app if it is running.<br />
Parameters:<br />
&nbsp; &nbsp; appName - the application name<br />
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> exitApp<span style="color: #000000;">&#40;</span>appName<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">if</span> appIsRunning<span style="color: #000000;">&#40;</span>appName<span style="color: #000000;">&#41;</span> <span style="color: #ff0033; font-weight: bold;">then</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">tell</span> <span style="color: #0066ff;">application</span> appName <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">quit</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">if</span><br />
<span style="color: #ff0033; font-weight: bold;">end</span> exitApp<br />
<span style="color: #808080; font-style: italic;">(*<br />
Name: getPosixOcrFilePath<br />
Description: Gets the OCR output filename for a given input filename.<br />
Parameters:<br />
&nbsp; &nbsp; posixFilePath - the full path to the source file<br />
Return: the POSIX path of the OCR output file<br />
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> getPosixOcrFilePath<span style="color: #000000;">&#40;</span>posixFilePath<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> posixBaseName <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">do shell script</span> ¬<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;">&quot;filename=&quot;</span> <span style="color: #000000;">&amp;</span> <span style="color: #0066ff;">quoted form</span> <span style="color: #ff0033; font-weight: bold;">of</span> posixFilePath <span style="color: #000000;">&amp;</span> <span style="color: #009900;">&quot;; echo ${filename%<span style="color: #000000; font-weight: bold;">\\</span>.*}&quot;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> posixOcrFilePath <span style="color: #ff0033; font-weight: bold;">to</span> posixBaseName <span style="color: #000000;">&amp;</span> ocrFileSuffix<br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">return</span> posixOcrFilePath<br />
<span style="color: #ff0033; font-weight: bold;">end</span> getPosixOcrFilePath<br />
<span style="color: #808080; font-style: italic;">(*<br />
Name: getNextFile<br />
Description: Finds the next unprocessed ScanSnap PDF<br />
Return: the file or &quot;&quot;<br />
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> getNextFile<span style="color: #000000;">&#40;</span>aFolder<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; logEvent<span style="color: #000000;">&#40;</span><span style="color: #009900;">&quot;Getting next file...&quot;</span><span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> masterFileList <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">list</span> <span style="color: #0066ff;">folder</span> aFolder ¬<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">without</span> <span style="color: #0066ff;">invisibles</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> posixPath <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">POSIX path</span> <span style="color: #ff0033; font-weight: bold;">of</span> aFolder<br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">repeat</span> <span style="color: #ff0033; font-weight: bold;">with</span> i <span style="color: #ff0033; font-weight: bold;">from</span> <span style="color: #000000;">1</span> <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">count</span> masterFileList<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> fileName <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">item</span> i <span style="color: #ff0033; font-weight: bold;">of</span> masterFileList<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> posixFilePath <span style="color: #ff0033; font-weight: bold;">to</span> posixPath <span style="color: #000000;">&amp;</span> fileName<br />
&nbsp; &nbsp; &nbsp; &nbsp; log posixFilePath<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">--</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">-- Construct a FineReader file name from our file</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">--</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> posixOcrFilePath <span style="color: #ff0033; font-weight: bold;">to</span> getPosixOcrFilePath<span style="color: #000000;">&#40;</span>posixFilePath<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">--</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">-- See if the FineReader file we constructed exists</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">--</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> ocrFileExists <span style="color: #ff0033; font-weight: bold;">to</span> posixFileExists<span style="color: #000000;">&#40;</span>posixOcrFilePath<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">tell</span> <span style="color: #0066ff;">me</span> <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #ff0033; font-weight: bold;">set</span> fileCreator <span style="color: #ff0033; font-weight: bold;">to</span> getSpotlightInfo for <span style="color: #009900;">&quot;kMDItemCreator&quot;</span> <span style="color: #ff0033; font-weight: bold;">from</span> posixFilePath<br />
&nbsp; &nbsp; &nbsp; &nbsp; log <span style="color: #000000;">&#40;</span><span style="color: #009900;">&quot;Creator: &quot;</span> <span style="color: #000000;">&amp;</span> fileCreator<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">if</span> <span style="color: #ff0033;">not</span> ocrFileExists <span style="color: #ff0033;">and</span> fileCreator <span style="color: #000000;">=</span> <span style="color: #009900;">&quot;ScanSnap Manager&quot;</span> <span style="color: #ff0033; font-weight: bold;">then</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">return</span> <span style="color: #0066ff;">POSIX file</span> posixFilePath<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">if</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">repeat</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">return</span> <span style="color: #009900;">&quot;&quot;</span><br />
<span style="color: #ff0033; font-weight: bold;">end</span> getNextFile<br />
<span style="color: #808080; font-style: italic;">(*<br />
Name: getSpotlightInfo<br />
Description: Gets a named attribute from metadata for a specific file.<br />
Parameters:<br />
&nbsp; &nbsp; for myattribute - the name of the attribute<br />
&nbsp; &nbsp; from myfile - the name of the file<br />
Returns: the attribute value or &quot;&quot; if none found<br />
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> getSpotlightInfo for myattribute <span style="color: #ff0033; font-weight: bold;">from</span> myfile<br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">try</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> this_kMDItemResult <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #009900;">&quot;&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">tell</span> <span style="color: #0066ff;">application</span> <span style="color: #009900;">&quot;Finder&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> this_item <span style="color: #ff0033; font-weight: bold;">to</span> myfile <span style="color: #ff0033;">as</span> <span style="color: #0066ff;">string</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> this_item <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">POSIX path</span> <span style="color: #ff0033; font-weight: bold;">of</span> this_item<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> this_kMDItem <span style="color: #ff0033; font-weight: bold;">to</span> myattribute<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> theResult <span style="color: #ff0033; font-weight: bold;">to</span> words <span style="color: #ff0033; font-weight: bold;">of</span> <span style="color: #000000;">&#40;</span><span style="color: #0066ff;">do shell script</span> <span style="color: #009900;">&quot;/usr/bin/mdls -name &quot;</span> <span style="color: #000000;">&amp;</span> this_kMDItem <span style="color: #000000;">&amp;</span> <span style="color: #009900;">&quot; -raw -nullMarker None &quot;</span> <span style="color: #000000;">&amp;</span> <span style="color: #0066ff;">quoted form</span> <span style="color: #ff0033; font-weight: bold;">of</span> this_item<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; log <span style="color: #009900;">&quot;Result: &quot;</span> <span style="color: #000000;">&amp;</span> theResult <span style="color: #ff0033;">as</span> <span style="color: #0066ff;">string</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">repeat</span> <span style="color: #ff0033; font-weight: bold;">with</span> j <span style="color: #ff0033; font-weight: bold;">from</span> <span style="color: #000000;">1</span> <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">number</span> <span style="color: #ff0033; font-weight: bold;">of</span> <span style="color: #0066ff;">items</span> <span style="color: #ff0033; font-weight: bold;">in</span> theResult<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> this_kMDItemResult <span style="color: #ff0033; font-weight: bold;">to</span> this_kMDItemResult <span style="color: #000000;">&amp;</span> <span style="color: #0066ff;">item</span> j <span style="color: #ff0033; font-weight: bold;">of</span> theResult <span style="color: #ff0033;">as</span> <span style="color: #0066ff;">string</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">if</span> j <span style="color: #000000;">&lt;</span> <span style="color: #0066ff;">number</span> <span style="color: #ff0033; font-weight: bold;">of</span> <span style="color: #0066ff;">items</span> <span style="color: #ff0033; font-weight: bold;">in</span> theResult <span style="color: #ff0033; font-weight: bold;">then</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> this_kMDItemResult <span style="color: #ff0033; font-weight: bold;">to</span> this_kMDItemResult <span style="color: #000000;">&amp;</span> <span style="color: #009900;">&quot; &quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">if</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">repeat</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">tell</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">on</span> <span style="color: #ff0033; font-weight: bold;">error</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> this_kMDItemResult <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #009900;">&quot;&quot;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">try</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">return</span> this_kMDItemResult<br />
<span style="color: #ff0033; font-weight: bold;">end</span> getSpotlightInfo<br />
<span style="color: #808080; font-style: italic;">(*<br />
Name: logEvent<br />
Description: Write an event to an event log<br />
Parameters:<br />
&nbsp; &nbsp; themessage - the message to write to the log<br />
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> logEvent<span style="color: #000000;">&#40;</span>themessage<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> theLine <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #000000;">&#40;</span><span style="color: #0066ff;">do shell script</span> ¬<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;">&quot;date &nbsp;+'%Y-%m-%d %H:%M:%S'&quot;</span> <span style="color: #ff0033;">as</span> <span style="color: #0066ff;">string</span><span style="color: #000000;">&#41;</span> ¬<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #000000;">&amp;</span> <span style="color: #009900;">&quot; &quot;</span> <span style="color: #000000;">&amp;</span> themessage<br />
&nbsp; &nbsp; <span style="color: #0066ff;">do shell script</span> <span style="color: #009900;">&quot;echo &quot;</span> <span style="color: #000000;">&amp;</span> <span style="color: #0066ff;">quoted form</span> <span style="color: #ff0033; font-weight: bold;">of</span> theLine <span style="color: #000000;">&amp;</span> ¬<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;">&quot; &gt;&gt; ~/Library/Logs/AppleScript-events.log&quot;</span><br />
<span style="color: #ff0033; font-weight: bold;">end</span> logEvent</div></td></tr></tbody></table></div>
<p><strong>Installation</strong></p>
<ul>
<li>Use the Script Editor to save this script as <strong>Run OCR on New Folder Items</strong> under <strong><em>User Home</em>/Library/Scripts/Folder Action Scripts</strong><br />
You may have to create the <strong>Folder Action Scripts</strong> folder.</li>
<li>Now open a Finder window and navigate to the parent folder of your scanned documents folder.</li>
<li>Right-click (or Control-click) the scanned documents folder and choose <strong>Folder Actions Setup&#8230;</strong></li>
<li>At this point, if folder actions are not enabled, you will likely have to enable them and add the script manually:
<ul>
<li> Check <strong>Enable Folder Actions</strong></li>
<li>Use the &#8220;+&#8221; buttons on the left and right sides to add the scan folder and then this script.</li>
</ul>
</li>
<li>Otherwise, a list of scripts will come up. Choose this script from the <strong>Choose a Script to Attach</strong> dialog.</li>
<li>Close all windows.</li>
</ul>
<p>That&#8217;s it! The script will be invoked automatically every time a new file appears in your scanned documents folder.</p>
<p>Please let me know if you have any ideas that can improve this script. I&#8217;m not an AppleScript guru, so someone might just know how to keep that annoying FineReader icon from jumping.</p>
]]></content:encoded>
			<wfw:commentRss>http://paperjammed.com/2010/01/04/automate-scansnap-ocr-process-on-your-mac-with-applescript-snow-leopard-edition/feed/</wfw:commentRss>
		<slash:comments>18</slash:comments>
		</item>
		<item>
		<title>Keeping your secrets to yourself—old changes lingering in your PDF files</title>
		<link>http://paperjammed.com/2009/11/23/keeping-your-secrets-to-yourself-old-changes-lingering-in-your-pdf-files/</link>
		<comments>http://paperjammed.com/2009/11/23/keeping-your-secrets-to-yourself-old-changes-lingering-in-your-pdf-files/#comments</comments>
		<pubDate>Tue, 24 Nov 2009 04:46:58 +0000</pubDate>
		<dc:creator>Tad</dc:creator>
				<category><![CDATA[Security]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Data Loss]]></category>
		<category><![CDATA[Geeky]]></category>
		<category><![CDATA[PDF]]></category>

		<guid isPermaLink="false">http://paperjammed.com/?p=781</guid>
		<description><![CDATA[A few months ago I wrote an article that touched upon the problems inherent in attempts to sanitize documents before sending them to the enemy, perhaps to remove competitors&#8217; names or trade secrets. I was reading a post on a board I frequent where a person was describing exactly this kind of activity, removing sensitive information from [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignright size-medium wp-image-791" title="Rusty trap" src="http://paperjammed.com/wp-content/uploads/2009/11/iStock_000011076402XSmall-300x225.jpg" alt="Rusty trap" width="300" height="225" />A few months ago I wrote an article that touched upon <a href="http://paperjammed.com/2009/04/21/keeping-your-secrets-to-yourself—what-can-your-shared-documents-tell-others/">the problems inherent in attempts to sanitize documents</a> before sending them to the enemy, perhaps to remove competitors&#8217; names or trade secrets.</p>
<p>I was reading a post on a board I frequent where a person was describing exactly this kind of activity—removing sensitive information from PDF documents. Several suggestions were made, but one individual suggested opening the file in Acrobat Pro and replacing the sensitive text with good old <a href="http://www.lipsum.com/">Lorem Ipsum</a>.</p>
<p>It was at that moment that I recalled a peculiar feature of the PDF file format: it is designed to support nondestructive updates, allowing people to make vast changes to a PDF document while still retaining the original document, fully intact. I did a few experiments and was surprised with the results.<span id="more-781"></span></p>
<p><strong>A Brief Note on the PDF File Format</strong></p>
<p>For the geeky types among us, one place to begin is this article:</p>
<p><a href="http://www.mactech.com/articles/mactech/Vol.15/15.09/PDFIntro/">Portable Document Format: An Introduction for Programmers</a></p>
<p>The key point to take from the article is that a PDF document consists of several distinct sections: a <strong>Header</strong>, a <strong>Body</strong>, an <strong>&#8220;xref&#8221; Table</strong>, and a <strong>Trailer</strong>. At the very end of the file you will find the character sequence <strong>%%EOF</strong>.</p>
<p>The PDF standard was designed to allow multiple updates to a document, while retaining the original version. This is accomplished by appending anything new to the end of the document, after the original <strong>EOF</strong> tag. The document will now have two <strong>EOF</strong> tags: one indicating where the original document ended, and a new <strong>EOF</strong> tag indicating where the new changes end.</p>
<p>If we wish to revert PDF changes, it should be a simple matter of opening the PDF file in a binary editor, searching for the first <strong>EOF</strong> tag, and deleting everything after it.</p>
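<p>For the curious, here is a minimal Python sketch of that same revert, assuming the simplest possible layout: one original revision followed by appended updates. The function name and the sample bytes are made up for illustration, and real PDF files can be laid out differently, so treat this as a demonstration of the idea rather than a recovery tool.</p>
<pre>
```python
EOF_MARKER = b"%%EOF"


def strip_incremental_updates(pdf_bytes):
    """Return the bytes up to and including the first %%EOF marker."""
    first = pdf_bytes.find(EOF_MARKER)
    if first == -1:
        raise ValueError("no %%EOF marker found")
    # Keep the marker itself, then add a newline so the result ends cleanly.
    return pdf_bytes[: first + len(EOF_MARKER)] + b"\n"


# Tiny stand-in for a real file: an original revision plus one appended update.
sample = b"%PDF-1.4\noriginal body\n%%EOF\nappended update\n%%EOF\n"
original = strip_incremental_updates(sample)
print(original)
```
</pre>
<p>To try it on a real document, work on a copy, and read and write the file in binary mode. The slice does by program what the binary editor does by hand.</p>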
<p><strong>A Simple Experiment</strong></p>
<p>Let&#8217;s start with a proper secret document containing missile plans&#8230;</p>
<p><img class="alignnone size-full wp-image-785" title="20091123-missile-plans-1" src="http://paperjammed.com/wp-content/uploads/2009/11/20091123-missile-plans-1.gif" alt="20091123-missile-plans-1" width="439" height="418" /></p>
<p>Suppose we want to obscure some special information in paragraph 37. We can open the file in Acrobat Professional and use its text editing features to swap in the venerable <em>Lorem Ipsum</em> text.</p>
<p>Here&#8217;s what it looks like after the switch:</p>
<p><img class="alignnone size-full wp-image-786" title="20091123-lorem-ipsum" src="http://paperjammed.com/wp-content/uploads/2009/11/20091123-lorem-ipsum.gif" alt="20091123-lorem-ipsum" width="598" height="243" /></p>
<p>You can see here that the first seven lines of text starting at paragraph 37 have been replaced with suitably unreadable text.</p>
<p>Now, open the new PDF file in a binary editor (since PDF files contain a mix of text and binary, the editor must be a binary editor).</p>
<p><img class="alignnone size-full wp-image-787" title="20091123-binary-editor" src="http://paperjammed.com/wp-content/uploads/2009/11/20091123-binary-editor.gif" alt="20091123-binary-editor" width="693" height="633" /></p>
<p>Note the <strong>%%EOF</strong> character sequence embedded in the text. This is the first <strong>EOF</strong> tag, indicating where the original file ended. All we need to do is place the cursor to the right of the <strong>EOF</strong> and delete everything to the end of the file.</p>
<p>Once we have done so, it&#8217;s like magic:</p>
<p><img class="alignnone size-full wp-image-788" title="20091123-after-binary-editing" src="http://paperjammed.com/wp-content/uploads/2009/11/20091123-after-binary-editing.gif" alt="20091123-after-binary-editing" width="794" height="323" /></p>
<p>The edits that replaced lines of paragraph 37 with gibberish have neatly been undone!</p>
<p><strong>More Details</strong></p>
<p>From the <a href="http://www.mactech.com/articles/mactech/Vol.15/15.09/PDFIntro/">PDF Intro document</a> linked earlier:</p>
<p>&#8220;The trailer, it turns out, plays an important role in the way PDF implements incremental updating. The key concept to understand here is that a PDF file is never overwritten, only added to. That goes for all portions of the PDF file &#8211; even the trailer itself, and the end-of-file marker. In other words, a multiply-updated PDF document may contain multiple trailers &#8211; and multiple end-of-file markers! (There may be numerous occurrences of %%EOF.) Each time the file is edited, an addendum is written to the tail of the file, consisting of the content objects that have changed, a new xref section, and a new trailer containing all the information that was in the previous trailer, as well as a /Prev key specifying the byte offset (from the beginning of the file) of the previous xref section. The cross-reference info will then be distributed across more than one xref section. To access all of the cross-references, the reader must walk the list of /Prev keys in all the trailers, in reverse order.</p>
<p>Space doesn&#8217;t permit a detailed exploration of updates here, but you can find several examples in Appendix A of the PDF 1.3 specification (available at <a href="http://partners.adobe.com/asn/developer">http://partners.adobe.com/asn/developer</a>).&#8221;</p>
<p><strong>Summary</strong></p>
<p>It is important to understand that the PDF standard allows for appended updates to files that leave the original document intact, regardless of how drastic the changes are. If you are intent on redacting text from PDF documents, do not depend on simply deleting the secrets using a PDF editor—you must use a proper redaction tool that addresses these issues correctly.</p>
<p>That said, I did some experimenting with a few utilities (Apple Preview, PDFpen, and Adobe Acrobat Pro) and found that some write the file from scratch each time, with no lingering cruft from former versions, while others respect the original intent of the PDF standard. This means that you can&#8217;t trust that older revisions are being retained in your file and you can&#8217;t trust that they aren&#8217;t.</p>
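<p>One rough way to see which camp a given tool falls into is to count the <strong>%%EOF</strong> markers in a file it has just saved. The sketch below is only an illustration using made-up sample bytes and a hypothetical function name. A count above one hints at appended revisions, but it is not proof either way.</p>
<pre>
```python
def count_eof_markers(pdf_bytes):
    """Count how many times the %%EOF marker appears in raw PDF bytes."""
    return pdf_bytes.count(b"%%EOF")


# Stand-ins for two saved files: one rewritten from scratch, one appended to.
rewritten = b"%PDF-1.4\nbody\n%%EOF\n"
appended = rewritten + b"changed objects\n%%EOF\n"

print(count_eof_markers(rewritten))
print(count_eof_markers(appended))
```
</pre>
<p>With these two samples the counts come out as 1 and 2 respectively.</p>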
<p>Be conservative: use a redaction tool for secrecy and proper backups for versioning.</p>
]]></content:encoded>
			<wfw:commentRss>http://paperjammed.com/2009/11/23/keeping-your-secrets-to-yourself-old-changes-lingering-in-your-pdf-files/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Dodged the corrupt-document bullet this time, just barely&#8230;</title>
		<link>http://paperjammed.com/2009/10/27/dodged-the-corrupt-document-bullet-this-time-just-barely/</link>
		<comments>http://paperjammed.com/2009/10/27/dodged-the-corrupt-document-bullet-this-time-just-barely/#comments</comments>
		<pubDate>Tue, 27 Oct 2009 21:52:30 +0000</pubDate>
		<dc:creator>Tad</dc:creator>
				<category><![CDATA[Searching and Indexing]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Data Loss]]></category>
		<category><![CDATA[Geeky]]></category>
		<category><![CDATA[Indexing]]></category>
		<category><![CDATA[Knowledge Management]]></category>
		<category><![CDATA[PDF]]></category>

		<guid isPermaLink="false">http://paperjammed.com/?p=750</guid>
		<description><![CDATA[A couple of weeks ago, a co-worker sent me a PDF document to look at. He said that he was having trouble copying and pasting from the document and was scratching his head about why this particular PDF would have such issues. As it would turn out, there were several thousand other documents on a [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignright size-medium wp-image-751" title="gibberish document in a file folder" src="http://paperjammed.com/wp-content/uploads/2009/10/iStock_000006486654XSmall-300x199.jpg" alt="gibberish document in a file folder" width="300" height="199" />A couple of weeks ago, a co-worker sent me a PDF document to look at. He said that he was having trouble copying and pasting from the document and was scratching his head about why this particular PDF would have such issues.</p>
<p>As it would turn out, there were several thousand other documents on a file server that shared the same funny behavior. By the time we were done struggling with this problem I had gained new respect for PDF corruption issues and their prevention.<span id="more-750"></span></p>
<p><strong>The Problem</strong></p>
<p>We were looking to load a few thousand of these scientific reports into a fancy-schmancy new database, with linguistics searching and other bells and whistles. Much to our chagrin, these documents just weren&#8217;t loading, and we couldn&#8217;t understand why. They were text documents, with some embedded images, but mostly straightforward text.</p>
<p>Here is an excerpt:</p>
<p><img class="alignnone size-full wp-image-755" title="20091027-plaintext" src="http://paperjammed.com/wp-content/uploads/2009/10/20091027-plaintext.gif" alt="20091027-plaintext" width="521" height="93" /></p>
<p>And you can tell that it is right and proper text because when I blow it up all the way, the fonts are nice and smooth—this isn&#8217;t just an image of text.</p>
<p><img class="alignnone size-full wp-image-756" title="20091027-smooth-letter" src="http://paperjammed.com/wp-content/uploads/2009/10/20091027-smooth-letter.gif" alt="20091027-smooth-letter" width="258" height="295" /></p>
<p>But if I copy and paste that particular paragraph into any handy editor (Notepad, in this case), this is what I see:</p>
<p><img class="alignnone size-full wp-image-757" title="20091027-notepad" src="http://paperjammed.com/wp-content/uploads/2009/10/20091027-notepad.gif" alt="20091027-notepad" width="496" height="155" /></p>
<p>And as far as I know, at this point the actual text is beyond the reach of average folks like me. We tried, believe me, we tried.</p>
<p><strong>What went wrong?</strong></p>
<p>A quick Google of the subject led us to understand that many PDF generation tools embed subsets of fonts, with nonstandard mappings from the text to the font.</p>
<p>This fellow explains it nicely:</p>
<p>&#8220;The PDF file does not contain all the information to extract the text. The problem is that a character in a PDF file may not contain information what &#8220;real&#8221; character it relates to. Some PDF generators do a pretty bad job when they embed fonts into PDF files. They use a proprietary encoding mechanism (e.g. 1 is A, 2 is B, 3 is C, &#8230;) in both the embedded font and when they place glyphs on the page. Without a table that implements the reverse (e.g. character code 1 is &#8216;A&#8217;) you cannot extract text from such a file.</p>
<p>There is nothing you can do (besides to complain to whoever created the PDF file, and the author of the software that created this file).&#8221;<br />
— from <a href="http://www.experts-exchange.com/Web_Development/Document_Imaging/Adobe_Acrobat/Q_21426533.html">khkremer on experts-exchange.com</a></p>
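<p>To make the quote concrete, here is a toy Python model of the problem. Everything in it is hypothetical: the three-entry table stands in for the shapes an embedded font subset draws, and the code list stands in for what the page actually stores. The reverse table the quote mentions is what the PDF format calls a ToUnicode map. This sketch simply leaves it out to show what copy and paste is left with.</p>
<pre>
```python
# Hypothetical private encoding, as in the quote: code 1 draws "A", 2 draws "B", ...
# The letters here stand for glyph shapes on screen, not for extractable text.
shape_for_code = {1: "A", 2: "B", 3: "C"}

# What the page content stores: just the private codes.
page_codes = [3, 1, 2]

# A viewer can still draw the page, because the font knows each shape.
rendered = "".join(shape_for_code[code] for code in page_codes)

# Copy and paste has no reverse table, so the codes come out as raw characters.
pasted = "".join(chr(code) for code in page_codes)

print(rendered)
print(repr(pasted))
```
</pre>
<p>The page looks like <tt>CAB</tt> on screen, while the pasted text is three control characters, which is the same flavor of gibberish we saw in Notepad.</p>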
<p>As it would turn out, many of the reports had been generated by printing to Adobe Distiller from Microsoft Word. It would seem that the default settings used for Distiller included the &#8220;totally hose my document content&#8221; switch.</p>
<p><strong>The Solution</strong></p>
<p>We fretted over this quite a bit. These are important scientific reports, and there is no way to easily ungarble them. We finally ended up contacting the <a href="http://finereader.abbyy.com/">ABBYY FineReader</a> folks and trying out their OCR toolkit for Linux: not only did this product make quick work of running optical character recognition on the sample document, but once we had a script running, we managed to blow through the 10,000 pages the trial license gave us in a day or two.</p>
<p><strong>Imperfect, at best</strong></p>
<p>I am happy that we were able to salvage the bulk of the electronic knowledge found within those thousands of files, but our work barely scratched the surface.</p>
<p>For example, most of these documents have rich bookmarking of sections and keywording, such as this (content tastefully blurred on purpose).</p>
<p><img class="alignnone size-full wp-image-760" title="20091027-doc-with-contents" src="http://paperjammed.com/wp-content/uploads/2009/10/20091027-doc-with-contents.gif" alt="20091027-doc-with-contents" width="500" height="348" /></p>
<p>In addition, scientific documents typically have loads of tables full of numbers. Though it is possible to mine this data with a good OCR tool (the FineReader API provides tools for just this purpose), the tables are far more difficult to extract correctly once the original text information is lost.</p>
<p><strong>Final thoughts</strong></p>
<p>I wrote a few weeks ago about document formats, <a href="http://paperjammed.com/2009/09/29/are-your-portable-document-format-files-all-that/">mentioning the PDF/A document standard</a>. This is worth investigating, regardless of what your document needs are.</p>
<p>If our thousands of files had been originally generated as PDF/A, it is certain that we would have been able to copy/paste from them without problem: PDF/A prohibits such font shenanigans as were perpetrated on our garbled reports.</p>
<p>In the end, our OCR sledgehammer approach worked like a charm, and is probably sufficient for our needs. Text mining is a pretty slushy business, so no one will complain if there are a few typos on each page. If they find the doc in a search, they can print it and read it the old-fashioned way.</p>
]]></content:encoded>
			<wfw:commentRss>http://paperjammed.com/2009/10/27/dodged-the-corrupt-document-bullet-this-time-just-barely/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Why not try a personal Wiki for some of your more amorphous notes?</title>
		<link>http://paperjammed.com/2009/10/12/why-not-try-a-personal-wiki-for-some-of-your-more-amorphous-notes/</link>
		<comments>http://paperjammed.com/2009/10/12/why-not-try-a-personal-wiki-for-some-of-your-more-amorphous-notes/#comments</comments>
		<pubDate>Tue, 13 Oct 2009 03:59:04 +0000</pubDate>
		<dc:creator>Tad</dc:creator>
				<category><![CDATA[Paperless Life]]></category>
		<category><![CDATA[Searching and Indexing]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Geeky]]></category>
		<category><![CDATA[Knowledge Management]]></category>
		<category><![CDATA[Media]]></category>
		<category><![CDATA[Networking]]></category>
		<category><![CDATA[Tools of the Trade]]></category>
		<category><![CDATA[Windows]]></category>

		<guid isPermaLink="false">http://paperjammed.com/?p=706</guid>
		<description><![CDATA[In my evenings, I sometimes find myself performing the role of &#8220;Resident Geek&#8221; at my nephew&#8217;s school, tending to network issues, computer problems, and my favorite, &#8220;The Internet is down!&#8221; Over the past couple of years I have considered several different approaches for keeping a grip on which computers had which service patch, which router [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignright size-medium wp-image-736" src="http://paperjammed.com/wp-content/uploads/2009/10/iStock_000008986250XSmall-300x199.jpg" alt="" width="300" height="199" />In my evenings, I sometimes find myself performing the role of &#8220;Resident Geek&#8221; at my nephew&#8217;s school, tending to network issues, computer problems, and my favorite, &#8220;The Internet is down!&#8221;</p>
<p>Over the past couple of years I have considered several different approaches for keeping a grip on which computers have which service patch, which router is getting flaky, and which cable connects the library to the classroom at the end of the hall.</p>
<p>I have tried Excel spreadsheets, an Access database, even a spiral-bound notebook—none of them made the job any easier. A few weeks ago I thought about trying a <a href="http://en.wikipedia.org/wiki/Wiki">Wiki</a> and this has turned out to be a perfect fit!</p>
<p>If you are looking to keep a loose scrapbook of notes with lots of arbitrary categories and relationships between them, a wiki might do the trick. In this article I&#8217;ll cover two simple freeware wikis you can carry around on a thumb drive.<span id="more-706"></span></p>
<p><strong>What&#8217;s in a Wiki?</strong></p>
<p>All of us have used Wikipedia at one time or another, and though it may be regarded with disdain by high school teachers, when you consider how it works, Wikipedia is an amazing achievement. But what is the nature of a wiki?</p>
<p>One of the key features is that any page can be easily edited at any time (of course this can be limited by permissions). Another attribute is the ability to breathe life into a new page just by calling its name.</p>
<p>Between these two features, you get the essence of wiki-ness.</p>
<p>For example, if I have a page that discusses North American bears, I can type in a list of bears in a special format, often in jammed-together <a href="http://en.wikipedia.org/wiki/CamelCase">Wiki Words</a>, like this:</p>
<ul>
<li><span style="color: #3366ff;"><strong>GrizzlyBear</strong></span></li>
<li><span style="color: #3366ff;"><strong>BlackBear</strong></span></li>
<li><span style="color: #3366ff;"><strong>BrownBear</strong></span></li>
</ul>
<p>As soon as I save the page, those bear names become hyperlinks. Even though I haven&#8217;t written any pages about the individual bears, whenever it finally suits me, I can click on <span style="color: #3366ff;"><strong>BlackBear</strong></span> and accept the invitation to &#8220;Create a new page called <span style="color: #3366ff;"><strong>BlackBear</strong></span>.&#8221;</p>
<p>Better still, a friend who knows about black bears might click on <span style="color: #3366ff;"><strong>BlackBear </strong></span>and write a beautiful page about the animals.</p>
<p>That&#8217;s what wikis are all about.</p>
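<p>Here is a minimal sketch, in Python, of the auto-linking idea described above. The pattern for what counts as a Wiki Word and the names <tt>find_wiki_words</tt> and <tt>link_status</tt> are assumptions made for illustration; every wiki engine has its own rules. It finds the jammed-together words on a page and decides whether each one should open an existing page or offer to create a new one.</p>

```python
import re

# A WikiWord here means two or more capitalized runs jammed together,
# e.g. "BlackBear". This pattern is an assumption for illustration;
# real wiki engines each define their own rules.
WIKI_WORD = re.compile(r"\b(?:[A-Z][a-z]+){2,}\b")


def find_wiki_words(text):
    """Return the distinct WikiWords in text, in order of first appearance."""
    seen = []
    for word in WIKI_WORD.findall(text):
        if word not in seen:
            seen.append(word)
    return seen


def link_status(text, existing_pages):
    """Map each WikiWord to 'view' if its page exists, else 'create'."""
    return {
        word: ("view" if word in existing_pages else "create")
        for word in find_wiki_words(text)
    }


page_text = "North American bears: GrizzlyBear, BlackBear and BrownBear."
print(link_status(page_text, existing_pages={"BrownBear"}))
```

<p>With only a BrownBear page written so far, the other two names come back marked &#8220;create&#8221;, which is the invitation to make a new page.</p>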
<p><strong>Back to the School Computers</strong></p>
<p>In a matter of minutes I was able to make a page that described the building and listed the various rooms in the building. I was able to then click on each room and &#8220;auto-vivify&#8221; a page for the room.</p>
<p>From that point, it was easy to create custom pages for each computer in the building, with each page listing the machine&#8217;s stats. I also created pages for each network switch or router.</p>
<p>In a matter of two or three evenings I had the skeleton of a solid knowledge base populated—it&#8217;s a pretty fancy looking web site with dozens of pages that took little effort to put together.</p>
<p>Last night I noticed that one of the machines wasn&#8217;t connecting to the Internet, though it connects fine to internal servers. I popped open its page on the wiki and added a simple note at the bottom of the page:</p>
<p><tt>2009-10-11 - This machine isn't able to connect to the Internet. Not sure why. It connects fine to internal servers.</tt></p>
<p>A few weeks ago I replaced a fan in a network switch. That, too, became an easy annotation on the wiki page for that device.</p>
<p><strong>Personal Wikis</strong></p>
<p>There are many uses for personal wikis, mostly centered around <a href="http://en.wikipedia.org/wiki/Personal_knowledge_management">personal knowledge management</a> and <a href="http://en.wikipedia.org/wiki/Personal_information_management">personal information management</a>. People use wikis as a replacement for time and task management tools, as a place for gathering thoughts, as a sort of amorphous database, and many other things.</p>
<p>There are many different personal wikis available—here&#8217;s a <a href="http://en.wikipedia.org/wiki/Personal_wiki#Free_software">short list of free ones</a>. One nice simple wiki to try is <a href="http://en.wikipedia.org/wiki/TiddlyWiki">TiddlyWiki</a>. If you are looking for something with a bit more substance, you can try a portable version of <a href="http://en.wikipedia.org/wiki/MediaWiki">MediaWiki</a>—the engine behind Wikipedia—that runs off your thumb drive.</p>
<p><strong>TiddlyWiki</strong></p>
<p>This afternoon I downloaded the flyweight portable wiki called TiddlyWiki. This is an amazingly tight little application: it comes in the form of a single fat web page that you copy to your thumb drive. As you make edits to your TiddlyWiki, the single HTML page is saved with your changes. Since it&#8217;s a single fancy file, backups are dead easy.</p>
<p>Here&#8217;s what it looks like when you first launch the &#8220;empty.html&#8221; file:</p>
<p><img class="alignnone size-medium wp-image-718" src="http://paperjammed.com/wp-content/uploads/2009/10/20091012-tiddly1-300x161.png" alt="" width="300" height="161" /></p>
<p>After a half hour of twiddling around, I had thrown together this basic set of &#8220;Tiddlers&#8221;:</p>
<p><img class="alignnone size-full wp-image-720" src="http://paperjammed.com/wp-content/uploads/2009/10/20091012-tiddly2.png" alt="" width="626" height="720" /></p>
<p>In this screen shot you can see that there are now links that bring up custom &#8220;Tiddlers&#8221; for each computer and for each room. I have opened one of the little pages for <span style="color: #3366ff;"><strong>Computer21</strong></span>.</p>
<p>They describe these pages as being comparable to note cards. All in all, it is tight and easy to use.</p>
<p>Want to give it a try? Download it from the <a href="http://www.tiddlywiki.com/">TiddlyWiki</a> site. You really need to play with it to get a feel for what it can do!</p>
<p><strong>MediaWiki</strong></p>
<p>If you are looking for something with a little more meat on it, you can run the Wikipedia engine on your USB drive.</p>
<p>The easiest way to set this up is to let <a href="http://www.chsoftware.net/en/useware/mowes/mowes.htm">MoWeS</a> do everything for you. <strong>MoWeS</strong> stands for <strong>Mo</strong>dular <strong>We</strong>bserver <strong>S</strong>ystem. It&#8217;s a free product that you can configure as a self-contained Apache web server with a variety of cool apps like MediaWiki, running off a thumb drive.</p>
<p>Here&#8217;s how to set up MediaWiki in five minutes:</p>
<ul>
<li>Go to the <a href="http://www.chsoftware.net/en/useware/mowes/download.htm">MoWeS Mixer</a></li>
<li>The first time around choose &#8220;I do not have a <strong>MoWeS Portable II</strong> Package and want to obtain a new package&#8221; when prompted and click <strong>Go</strong>.</li>
<li>On the software lists, check <strong>Apache2</strong>, <strong>MySQL5</strong>, <strong>PHP5</strong>, and <strong>MediaWiki</strong></li>
<li>Click <strong>Download Now</strong></li>
<li>At this point they ask you some kind of question <em>in German</em>, to filter spambots, but it seems to be a simple math problem. Fill in the answer and click <strong>Submit Query</strong><br />
(&#8220;<em>Zum Schutz vor Downloadrobotern geben Sie bitte das Ergebnis dieser Aufgabe ein: 5 + 8 = ?</em>&#8221;)</li>
<li>Unzip the downloaded zip file,  <strong>mowes_portable.zip</strong>, and copy the files to your USB drive</li>
<li>Open your thumb drive and double-click <strong>mowes.exe</strong></li>
<li>Select your language and accept the license</li>
<li>Click <strong>install</strong>, and confirm when prompted</li>
</ul>
<p>The installation process may take several minutes, but rest assured that it isn&#8217;t installing anything on your computer.</p>
<p><strong>Note: </strong>I received two or three firewall warnings for the Apache web server and the MySQL database. I had to click the &#8220;Unblock&#8221; button for all of them before my new MediaWiki-on-a-stick would work correctly.</p>
<p>After all of the dust settled, I had this little window on my screen:</p>
<p><img class="alignnone size-medium wp-image-725" title="20091012-MoWeS1" src="http://paperjammed.com/wp-content/uploads/2009/10/20091012-MoWeS1-300x209.png" alt="20091012-MoWeS1" width="300" height="209" /></p>
<p>In order to shut down and close out, just click the <strong>End</strong> button.</p>
<p>Once your MediaWiki USB key is running, you can go to this web page:</p>
<p><span style="color: #3366ff;">http://127.0.0.1/mediawiki/index.php/Main_Page</span></p>
<p><img class="alignnone size-full wp-image-726" src="http://paperjammed.com/wp-content/uploads/2009/10/20091012-MoWeS2.png" alt="" width="593" height="524" /></p>
<p>It looks just like Wikipedia, doesn&#8217;t it?</p>
<p>What a truly amazing thing: you can carry around your own Wikipedia server on a USB key, plug it into any random machine, and start it up.</p>
<p><strong>Different Wiki Features</strong></p>
<p>As you try out different wiki software, you will notice that there are plenty of differences in the features they support:</p>
<ul>
<li>Each wiki has a different kind of editor. Some are visual; others are simple text editors.</li>
<li>The markup syntax you use for pages is different from wiki to wiki.</li>
<li>Most wikis support features such as &#8220;category pages&#8221; that find all pages tagged with a category.</li>
<li>Some support adding images and other content; others don&#8217;t. I imagine that TiddlyWiki probably has some means of embedding images, but I couldn&#8217;t find it.</li>
<li>A quick glance at the MediaWiki screenshot above shows extended features such as the Discussion tab and the History tab.</li>
<li>Some use the filesystem for their pages; others use a database.</li>
</ul>
<p>Since I wanted a central wiki for the whole school, I chose a different product from the portable wikis I discussed here: I decided to run <a href="http://moinmo.in/">MoinMoin</a> on an <a href="http://www.ubuntu.com/">Ubuntu</a> installation on an aging Gateway desktop machine. Nevertheless, the basic idea is still the same.</p>
<p>Once that arrangement becomes a little more stable I&#8217;ll write up a howto document, like the <a href="http://paperjammed.com/2009/02/15/new-life-for-an-old-clunker/">Linux NAS</a> one from a few months back.</p>
<p><strong>Other Sources</strong></p>
<p>There are loads of different personal wiki options out there and many people have written how-to documents and tutorials. Here are a few:</p>
<ul>
<li><a href="http://lifehacker.com/354005/run-your-personal-wikipedia-from-a-usb-stick">Run Your Personal Wikipedia from a USB Stick</a> (Lifehacker.com)</li>
<li><a href="http://lifehacker.com/163707/geek-to-live--set-up-your-personal-wikipedia">Geek to Live: Set up your personal Wikipedia</a> (Lifehacker.com)</li>
<li><a href="http://www.pmwiki.org/wiki/Cookbook/WikiOnAStick">Wiki On A Stick</a> (PmWiki.org)</li>
<li><a href="http://cplus.about.com/od/thebusinessofsoftware/ss/woas.htm">Getting Started with Wiki on a Stick</a> (About.com)</li>
<li><a href="http://www.giffmex.org/twfortherestofus.html">TiddlyWiki for the rest of us</a> (giffmex)</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://paperjammed.com/2009/10/12/why-not-try-a-personal-wiki-for-some-of-your-more-amorphous-notes/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>When migrating to a new operating system, Look Before You Leap!</title>
		<link>http://paperjammed.com/2009/09/07/when-migrating-to-a-new-operating-system-look-before-you-leap/</link>
		<comments>http://paperjammed.com/2009/09/07/when-migrating-to-a-new-operating-system-look-before-you-leap/#comments</comments>
		<pubDate>Tue, 08 Sep 2009 02:56:21 +0000</pubDate>
		<dc:creator>Tad</dc:creator>
				<category><![CDATA[Backups]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Data Loss]]></category>
		<category><![CDATA[Geeky]]></category>
		<category><![CDATA[Macintosh]]></category>
		<category><![CDATA[Tips]]></category>

		<guid isPermaLink="false">http://paperjammed.com/?p=676</guid>
		<description><![CDATA[I can&#8217;t help it. As soon as I hear of a new version of anything, whether it&#8217;s an application or the entire operating system, I have to install it. Now prudence would lead one to take careful steps and wait until all of the wrinkles are ironed out before starting. I was almost not prudent [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignright size-medium wp-image-685" src="http://paperjammed.com/wp-content/uploads/2009/09/iStock_000005873765XSmall-241x300.jpg" alt="" width="241" height="300" />I can&#8217;t help it. As soon as I hear of a new version of <em>anything</em>, whether it&#8217;s an application or the entire operating system, I have to install it.</p>
<p>Now prudence would lead one to take careful steps and wait until all of the wrinkles are ironed out before starting. I was almost not prudent enough this week.</p>
<p><strong>Mac OS X Snow Leopard</strong></p>
<p>So folks have been talking about the new <a href="http://en.wikipedia.org/wiki/Mac_OS_X_v10.6">Snow Leopard</a> operating system for Mac. Over the past year, Apple has been positioning this version as more of an &#8220;under the hood&#8221; upgrade that tightens things up rather than a glitzy overhaul of the user interface. No matter what they said it was, I figured that it was newer, and therefore better, than the current OS, Leopard, and I had to have it.</p>
<p>I ordered my copy last week on Amazon and sat down with a smile as I awaited its arrival. And then I thought about doing a few quick Googles to see how other people have been making out with Snow Leopard. I immediately happened upon a few upgrade guides <a href="http://www.cultofmac.com/how-to-upgrade-to-snow-leopard-the-right-way/15141">like this one</a>, providing sage advice about the upgrade process. They recommended the &#8220;slash and burn&#8221; method, starting from a clean hard drive, and I felt that was a good idea. Nothing better than a wipe and fresh install to make your machine zip along twice as fast. And therein lies a tale.<span id="more-676"></span></p>
<p><strong>The first sign of trouble</strong></p>
<p>As I was reading up on the Snow Leopard upgrade process, I happened upon lists of &#8220;unsupported software&#8221; and casually glanced at the lists, expecting esoteric tools only used by three über geeks in the audio recording industry or perhaps some exotic ray-tracing software. Much to my surprise, I saw two of my favorite applications, in a very very short list of troublesome apps: <a href="http://en.wikipedia.org/wiki/Parallels_Desktop_for_Mac">Parallels</a> and <a href="http://www.elgato.com/elgato/na/mainmenu/home/what-is-eyetv.en.html">EyeTV</a>.</p>
<p>I immediately checked the versions and breathed a sigh of relief when I saw that my EyeTV version was safe. But, Parallels was another story&#8230; They have no plans for patching Parallels 3 to work with Snow Leopard, and why should they, when they can sell us Parallels 4!</p>
<p>So, I ordered my fresh copy of Parallels 4, from Amazon with a twenty dollar rebate. When it arrived, I spent an evening upgrading Parallels, and I thought I was all set for Snow Leopard.</p>
<p><strong>Preventative Measures</strong></p>
<p>Following the advice of the upgrade websites, and prior experience, I used <a href="http://www.bombich.com/software/ccc.html">Carbon Copy Cloner</a> to make a full backup of my hard drive on a spare external drive. On a hunch, I turned on the drive that I use for <a href="http://en.wikipedia.org/wiki/Time_Machine_(Apple_software)">Time Machine</a> and had it do one final &#8220;Time Machine&#8221; sweep through the system before bidding <em>adieu</em> to Leopard.</p>
<p>I knew that I had all of my installation media for stuff like iLife and Photoshop Elements, and I had all of my license keys in electronic form. It would be a simple matter of mounting the backup drive, copying over my loads of documents, and peering into them to find keys.</p>
<p><strong>The first attempt</strong></p>
<p>I boldly inserted the Snow Leopard disk and booted from the DVD drive, selecting the &#8220;Slash and Burn&#8221; method of installation. I reformatted the hard drive and went off for dinner while Snow Leopard installed.</p>
<p><strong>Trouble</strong></p>
<p>When I got home that evening, I started the lengthy process of installing stuff. I suddenly realized that it was not as easy as I had hoped: it&#8217;s one thing to reinstall something like Microsoft Office, but there seemed to be more loose ends than I had considered:</p>
<ul>
<li>How would I migrate my Mail settings from the old image to the new?</li>
<li>What was the best way to migrate the Address Book contents?</li>
<li>iTunes is great, but it has tendrils in everything. Can I simply copy my old library to the new without messing up my iPhone, Address Book, or other linked stuff?</li>
<li>How about those nice password tools such as 1Password and SplashID that keep your passwords safe and sound? I had no clue how to get their contents from the backup. I wasn&#8217;t sure if it was even possible to do so—perhaps I was supposed to have exported the data beforehand.</li>
</ul>
<p>It was becoming clearer to me that I had not done my homework at all.</p>
<p><strong>More trouble</strong></p>
<p>My initial shock at the depth of the upgrade process led me to start making a list of applications and looking at what I needed for each one. I soon found out that Snow Leopard support is somewhat spotty in many applications. In particular, the FineReader for ScanSnap software that I depend on so much for my scanning workflow is <a href="http://www.documentsnap.com/abbyy-finereader-and-snow-leopard-file-not-created-with-scansnap/">not fully supported</a>. Fujitsu says that they will have an update soon and to keep checking their web site.</p>
<p>My password tool, 1Password, is <a href="http://www.switchersblog.com/2009/08/update-1password-on-snow-leopard.html">another problem child</a>. It works only on 32-bit Safari, and Snow Leopard now runs Safari in 64-bit mode. Of course, a new version is coming, and I will probably have to pay for it, but it is still in beta.</p>
<p>There was <a href="http://graphicssoft.about.com/b/2009/08/28/what-about-photoshop-elements-6-in-snow-leopard.htm">quite a bit of chatter</a> on the Web about whether Adobe Photoshop Elements would work on Snow Leopard, and the responses seem split fifty-fifty for now.</p>
<p>Three very important tools were in danger of running in limited mode or not running at all, so I had to throw in the towel.</p>
<p><strong>Time Machine saves the day!</strong></p>
<p>As I sat, humbled, before my vanilla install of Snow Leopard, I admitted defeat. I slipped the Snow Leopard DVD back in the drive and rebooted from the DVD. This time, I selected the &#8220;Restore from Time Machine&#8221; option and turned on my Time Machine drive.</p>
<p>Guess what? It worked perfectly! Unlike many software products, Time Machine does exactly what it promises.</p>
<p>Within a few hours, my machine was fully restored to the way it looked seconds before I made my first attempt at Snow Leopard.</p>
<p><strong>A Final Word</strong></p>
<p>Learn from my mistakes, and from the way a full backup saved me. As much as you can&#8217;t wait to upgrade, please do the following:</p>
<ul>
<li>Inventory all of your applications that you really need.</li>
<li>Obtain the installation media (download or CD) for every single one.</li>
<li>Obtain the keys for every single one.</li>
<li>Investigate whether you need to export data from any of them, and make a checklist for these exports prior to upgrade.</li>
<li>Check the &#8220;Unsupported Software&#8221; lists that are out there for any red flags.</li>
<li>Check the web sites of your most important apps for their official word.</li>
<li>And finally, do a complete backup!</li>
</ul>
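<p>The inventory step in the list above can be started with a few lines of Python. This is only a rough sketch: the function names <tt>list_app_bundles</tt> and <tt>checklist</tt> are made up for illustration, and it assumes that applications are <tt>.app</tt> bundles sitting directly in one folder (on a Mac, usually <tt>/Applications</tt>). It will not find command-line tools or other utilities installed elsewhere, so treat its output as a starting point for the list, not the whole list.</p>

```python
import os
import tempfile


def list_app_bundles(folder):
    """Return the sorted names of .app bundles directly inside folder."""
    return sorted(
        name[: -len(".app")]
        for name in os.listdir(folder)
        if name.endswith(".app") and os.path.isdir(os.path.join(folder, name))
    )


def checklist(app_names):
    """Build one checklist line per application, covering the steps above."""
    return [
        "[ ] {0}: media? key? export needed? 10.6 support?".format(name)
        for name in app_names
    ]


# Demonstration against a throwaway folder with two stand-in bundles.
# On a Mac, pass "/Applications" to list_app_bundles instead.
with tempfile.TemporaryDirectory() as apps:
    os.mkdir(os.path.join(apps, "EyeTV.app"))
    os.mkdir(os.path.join(apps, "Parallels Desktop.app"))
    for line in checklist(list_app_bundles(apps)):
        print(line)
```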
<p>It&#8217;s amazing how many applications and weird little utilities we forget we have. How could I have possibly remembered that I compiled a custom copy of the &#8220;rsync&#8221; executable for my backup workflow? I would have lost that and had to figure out how to rebuild it on Snow Leopard.</p>
<p>And I haven&#8217;t even talked about making sure your documents make it safely onto the new machine. That&#8217;s a whole &#8216;nother story.</p>
<p>In case I forgot to say it, please make a full backup.</p>
<p><strong>[Update: I'm giving Snow Leopard a rest for a few months]</strong></p>
<p>It has been said that Time Machine allows you to do a full restore from bare metal, and I&#8217;m living proof: I have done exactly that twice in the past week, with astounding success.</p>
<p>Encouraged by an episode of the <a href="http://www.macobserver.com/tmo/features/mac_geek_gab/">Mac Geek Gab</a> where they talked about their experiences upgrading their existing systems to Snow Leopard, I decided I would give the upgrade-in-place option a try. I expected some things to not work well and others to be quirky, but here&#8217;s what happened&#8230;</p>
<p>The actual install was painless, taking an hour or so to complete. I then began to kick the tires to see what was broken.</p>
<p>It was clear where those 64 bits went: apps like Safari were positively zippy, and I was pleasantly surprised with each new application I launched. All of my special settings seemed to make it through alive, including my password manager, though I did have to re-enter some of my registration keys. All of my mail and contacts made it through well. I was able to sync my iPhone without incident.</p>
<p>I found a few apps that weren&#8217;t working correctly and I looked for newer 10.6-compatible versions. I found newer versions of <a href="http://www.ironicsoftware.com/yep/">Yep</a> and <a href="http://alum.hampshire.edu/~bjk02/xGestures/">xGestures</a>.</p>
<p>I did note that there is currently no ad blocker available for Safari that runs in 64-bit mode. This is disappointing because even though I understand that Apple wants us to see <em>their</em> ads, I can&#8217;t imagine that they really want us to suffer from the flickering jumping dreck that should have ended with the hated &#8220;punch the monkey&#8221; banners of years gone by. The fact of the matter is, if I want that 64-bit speed and snap, I guess I have to watch ads.</p>
<p><strong>The Showstopper</strong></p>
<p>I decided to scan a document to see just how difficult it would be to get my workflow going again. Michael F, below, wrote the truth about the situation: the scanner works fine in certain modes, but the OCR software doesn&#8217;t.</p>
<p>He pointed out that it was a problem of the FineReader software looking for a specific bit of metadata in the PDF identifying it as a ScanSnap PDF. Sadly, that metadata string changed.</p>
<blockquote><p>The Finereader software is looking for “Mac OS X 10.5.8 Quartz PDFContext”, but under Snow Leopard, the string is set to “Mac OS X 10.6 Quartz PDFContext” instead.</p></blockquote>
<p>There are ways to tweak PDF metadata, and one of them is by using <a href="http://www.accesspdf.com/pdftk/">pdftk</a>.</p>
<p>I went to the pdftk site, all ready to download it and start OCRing my PDFs. I was greeted with less than optimal news: they have a version compiled for Panther, a version of OS X from several years ago.</p>
<p>I knew it wouldn&#8217;t work, but I gave it a try anyway: the app told me it needed Rosetta to run. I could have installed Rosetta at that point, but I figured I wanted a <em>proper</em> compiled version.</p>
<p>From there, I looked into compiling the app on OS X 10.6. I should have remembered my struggles with this several months ago on a Solaris Unix box when I found that pdftk depends on a monster called GCJ that required about forty other software packages to compile—it seemed a gargantuan task that I wasn&#8217;t ready to begin.</p>
<p>On a hunch, I inspected the content of a <em>new</em> PDF and an <em>old</em> PDF, the latter still acceptable to FineReader. Though much of the file was raw binary, the metadata was in text at the end. A short <a href="http://en.wikipedia.org/wiki/Sed">sed</a> script was all it took to replace the offending 10.6 string with the 10.5.8 string that FineReader expects.</p>
<p>In spite of my best efforts, FineReader still rejected my hand-tooled PDF file. It knew that it was a bogus file.</p>
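<p>For the curious, here is a minimal sketch of that kind of substitution, written in Python as a stand-in for the sed script, which is not reproduced here. The two strings come from the quote above; the function name <tt>swap_producer</tt> and the sample bytes (including the <tt>/Producer</tt> key) are assumptions for illustration, not a real ScanSnap file.</p>

```python
OLD = b"Mac OS X 10.6 Quartz PDFContext"
NEW = b"Mac OS X 10.5.8 Quartz PDFContext"


def swap_producer(data, old=OLD, new=NEW):
    """Return (patched bytes, number of replacements) for raw PDF bytes."""
    count = data.count(old)
    return data.replace(old, new), count


# Stand-in bytes for demonstration; not a real PDF.
sample = b"%PDF-1.3 ... /Producer (Mac OS X 10.6 Quartz PDFContext) ... %%EOF"
patched, count = swap_producer(sample)

# Number of replacements, then the change in length in bytes.
print(count, len(patched) - len(sample))
```

<p>This prints <tt>1 2</tt>: one replacement, and a result that is two bytes longer. A PDF keeps a table of byte offsets to its objects, so an edit that changes the length can leave the file internally inconsistent. Whether that is what FineReader objected to in this case is not known.</p>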
<p>I have looked into Abbyy FineReader several times before, as well as Fujitsu&#8217;s ScanSnap support, and was unimpressed. For two vendors that produce products that are at the top of their class—FineReader is arguably the best OCR you can get for Mac, and ScanSnap is the best document scanner for the common man—they sure do have miserable customer support.</p>
<p>It is as if neither company cares a whit about the Macintosh platform or their customers. While most other vendors are busily patching their products and giving hourly updates on their Snow Leopard compatibility progress, Abbyy and Fujitsu just don&#8217;t seem to care that their best-of-breed combo suddenly doesn&#8217;t work on Mac.</p>
<p>Once they get this sorted out (hopefully in the next few months) I&#8217;ll give Snow Leopard another try. In the meantime, I&#8217;m sticking with good old Leopard.</p>
]]></content:encoded>
			<wfw:commentRss>http://paperjammed.com/2009/09/07/when-migrating-to-a-new-operating-system-look-before-you-leap/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Automate ScanSnap OCR process on your Mac with AppleScript</title>
		<link>http://paperjammed.com/2009/08/29/automate-scansnap-ocr-process-on-your-mac-with-applescript/</link>
		<comments>http://paperjammed.com/2009/08/29/automate-scansnap-ocr-process-on-your-mac-with-applescript/#comments</comments>
		<pubDate>Sat, 29 Aug 2009 23:50:08 +0000</pubDate>
		<dc:creator>Tad</dc:creator>
				<category><![CDATA[Software]]></category>
		<category><![CDATA[Workflow]]></category>
		<category><![CDATA[Geeky]]></category>
		<category><![CDATA[PDF]]></category>
		<category><![CDATA[Scanning]]></category>
		<category><![CDATA[Scripting]]></category>
		<category><![CDATA[Searching and Indexing]]></category>

		<guid isPermaLink="false">http://paperjammed.com/?p=648</guid>
		<description><![CDATA[Some months back I wrote an article on using scripting languages to glue workflows together. My inspiration for that article was a bit of AppleScript that I had suffered over in order to smooth over a minor annoyance of my scan-to-OCR workflow. I had promised that once I cleaned up the embarrassing bits of code [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignright size-full wp-image-658" src="http://paperjammed.com/wp-content/uploads/2009/08/20090829-applescript.gif" alt="" width="128" height="128" />Some months back I wrote an article on using scripting languages to glue workflows together. My inspiration for that article was a bit of AppleScript that I had suffered over in order to smooth over a minor annoyance of my scan-to-OCR workflow.</p>
<p>I had promised that once I cleaned up the embarrassing bits of code I would post a perfect polished version here, but such promises are rarely fulfilled. A reader posted a comment asking for that source code, so I will post it here in its current state. The truth is, I have been using this script for months and, though it has some quirks, it works fine.</p>
<p>So this post is about Macintosh, AppleScript, and the ScanSnap-to-FineReader workflow. If these don&#8217;t interest you, better move on.</p>
<p><b>Update:</b> The script on this page works only with Leopard (10.5). Get the Snow Leopard version <a href="http://paperjammed.com/2010/01/04/automate-scansnap-ocr-process-on-your-mac-with-applescript-snow-leopard-edition/">here</a>.<br />
<span id="more-648"></span></p>
<p><strong>The Original Problem</strong></p>
<p>The Fujitsu ScanSnap S510m, my workhorse scanner, was designed to scan documents quickly and generate PDF files—this it does flawlessly. In order to provide OCR support, they have shipped a special version of <a href="http://finereader.abbyy.com/">FineReader</a>, called <strong>FineReader for ScanSnap</strong>. The standard OCR configuration is to chain the output of the scanner to the FineReader program.</p>
<p>The problem is that this forces scanning and OCR to run in lockstep: you scan a document, you wait for OCR, and then you scan another document.</p>
<p>My desire was to write a simple AppleScript that would detach the &#8220;Scan a Document&#8221; process from the &#8220;OCR&#8221; process. By using this script, I can scan documents at whatever rate pleases me, and the OCR engine will chunk along at its own pace, consuming my scanned documents and producing OCR documents.</p>
<p><strong>My Approach</strong></p>
<p>I really looked hard at the OCR application, trying to find AppleScript hooks or special command line switches that might allow me to control it better. Sadly, it was not designed to be scriptable. The only thing I could do was call the FineReader application with a source file.</p>
<p>Given this limitation, I considered writing a script that would look at a particular folder, identifying new files as they appear and passing them on to FineReader.</p>
<p>Fortunately, AppleScript provides this kind of functionality with little effort in the form of <strong>Folder Actions</strong>. Perhaps the best way to see these in action (and try it out) is to see this post on <a href="http://www.tuaw.com/2009/02/16/applescript-exploring-the-power-of-folder-actions-part-i/">Exploring the power of Folder Actions</a>.</p>
<p>In order to achieve my goals, I did the following:</p>
<ul>
<li>Created a folder called &#8220;Pending Documents&#8221;</li>
<li>Wrote the script to find the oldest unprocessed file and call FineReader with it</li>
<li>Attached the script to the folder as a Folder Action</li>
</ul>
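<p>Before getting to the AppleScript itself, here is the queue idea from the steps above as a minimal Python sketch. It is not a translation of the script: the names <tt>get_next_file</tt> and <tt>drain</tt> are hypothetical, the <tt>process</tt> callback stands in for handing a file to FineReader, and deleting each file afterward is just a stand-in for however a real workflow marks a file as processed.</p>

```python
import os
import tempfile
import time


def get_next_file(folder, suffix=".pdf"):
    """Return the path of the oldest matching file in folder, or "" if none."""
    candidates = [
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.lower().endswith(suffix)
    ]
    candidates = [path for path in candidates if os.path.isfile(path)]
    if not candidates:
        return ""
    return min(candidates, key=os.path.getmtime)


def drain(folder, process):
    """Feed files to process one at a time, oldest first, until none remain."""
    handled = []
    while True:
        path = get_next_file(folder)
        if path == "":
            break
        process(path)  # hypothetical hook: hand the file to the OCR step
        handled.append(os.path.basename(path))
        os.remove(path)  # stand-in for "mark as processed"
    return handled


# Demonstration with two stub files whose timestamps are set by hand.
with tempfile.TemporaryDirectory() as pending:
    for age, name in [(200, "first.pdf"), (100, "second.pdf")]:
        path = os.path.join(pending, name)
        with open(path, "wb") as handle:
            handle.write(b"stub")
        stamp = time.time() - age
        os.utime(path, (stamp, stamp))
    print(drain(pending, process=lambda path: None))
```

<p>The point of the design is that scanning only ever adds files to the folder, while the loop only ever takes the oldest one out, so neither side has to wait for the other.</p>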
<p><strong>The Script</strong></p>
<p>Let&#8217;s jump right into the AppleScript. <a href="http://paperjammed.com/wp-content/uploads/2009/08/Run-OCR-on-New-Folder-Items.scpt">Download the script here.</a></p>
<div class="codecolorer-container applescript default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:435px;height:300px;"><table cellspacing="0" cellpadding="0"><tbody><tr><td><div class="applescript codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #808080; font-style: italic;">(*<br />
This is a folder listener script that will act as a queue, receiving<br />
PDF files from the ScanSnap scanner and feeding them, one by one, to<br />
the Abbyy FineReader OCR software.<br />
<br />
This allows you to keep scanning while the OCR job runs in the background<br />
on all of the unprocessed files.<br />
<br />
Why do we want to do this?<br />
<br />
The ScanSnap Manager software does not support this by default, so<br />
when you scan in a file, it sends it to FineReader for OCR. You then<br />
must wait until FineReader finishes its work before scanning in another<br />
document.<br />
<br />
This script allows you to keep scanning without waiting for OCR.<br />
<br />
Installation:<br />
<br />
o &nbsp; Copy this script to:<br />
<br />
&nbsp; &nbsp; &lt;home&gt;/Library/Scripts/Folder Action Scripts<br />
<br />
&nbsp; &nbsp; You may have to create the &quot;Folder Action Scripts&quot; folder.<br />
<br />
o &nbsp; Now open a Finder window, control-click and choose:<br />
<br />
&nbsp; &nbsp; More / Configure Folder Actions...<br />
<br />
o &nbsp; Check the &quot;Enable Folder Actions&quot; checkbox, if not checked<br />
o &nbsp; Click the &quot;+&quot; in the bottom left<br />
o &nbsp; Select a folder and click Open<br />
o &nbsp; Choose the script &quot;Run OCR on New Folder Items&quot; and click Attach<br />
<br />
Copyright (C) 2009 Tad Harrison<br />
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> <span style="color: #0066ff;">adding</span> <span style="color: #0066ff;">folder</span> <span style="color: #0066ff;">items</span> <span style="color: #ff0033; font-weight: bold;">to</span> this_folder <span style="color: #ff0033;">after</span> <span style="color: #0066ff;">receiving</span> added_items<br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">try</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">-- Just in case FineReader is running, wait until it is ready</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; waitForFineReaderFinish<span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> moreWorkToDo <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">true</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">repeat</span> <span style="color: #ff0033; font-weight: bold;">while</span> moreWorkToDo<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> aFile <span style="color: #ff0033; font-weight: bold;">to</span> getNextFile<span style="color: #000000;">&#40;</span>this_folder<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">if</span> <span style="color: #ff0033;">not</span> aFile <span style="color: #000000;">=</span> <span style="color: #009900;">&quot;&quot;</span> <span style="color: #ff0033; font-weight: bold;">then</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; log <span style="color: #0066ff;">POSIX path</span> <span style="color: #ff0033; font-weight: bold;">of</span> aFile<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; ocrFile<span style="color: #000000;">&#40;</span>aFile<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">else</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> moreWorkToDo <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">false</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">if</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">repeat</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; exitApp<span style="color: #000000;">&#40;</span><span style="color: #009900;">&quot;FineReader for ScanSnap&quot;</span><span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">on</span> <span style="color: #ff0033; font-weight: bold;">error</span> errorStr <span style="color: #0066ff;">number</span> errNum<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #0066ff;">display dialog</span> <span style="color: #009900;">&quot;Error &quot;</span> <span style="color: #000000;">&amp;</span> errNum <span style="color: #000000;">&amp;</span> <span style="color: #009900;">&quot; while running OCR: &quot;</span> <span style="color: #000000;">&amp;</span> errorStr<br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">try</span><br />
<span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #0066ff;">adding</span> <span style="color: #0066ff;">folder</span> <span style="color: #0066ff;">items</span> <span style="color: #ff0033; font-weight: bold;">to</span><br />
<span style="color: #808080; font-style: italic;">(*<br />
Name: ocrFile<br />
Description: Runs OCR on the next un-OCR'd file<br />
Parameters:<br />
&nbsp; aFile - the file to be OCR'd<br />
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> ocrFile<span style="color: #000000;">&#40;</span>aFile<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">tell</span> <span style="color: #0066ff;">application</span> <span style="color: #009900;">&quot;FineReader for ScanSnap&quot;</span> <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">open</span> aFile<br />
&nbsp; &nbsp; <span style="color: #808080; font-style: italic;">-- Make sure FineReader actually starts before we start waiting for it to stop</span><br />
&nbsp; &nbsp; waitForFineReaderStart<span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #808080; font-style: italic;">-- Now wait 'till it's done so we do one file at a time</span><br />
&nbsp; &nbsp; waitForFineReaderFinish<span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span><br />
<span style="color: #ff0033; font-weight: bold;">end</span> ocrFile<br />
<span style="color: #808080; font-style: italic;">(*<br />
Name: appIsRunning<br />
Description: Determines if a particular application is running.<br />
Parameters:<br />
&nbsp; &nbsp; appName - the name of the application to be tested<br />
Returns: True if the application is running; otherwise False<br />
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> appIsRunning<span style="color: #000000;">&#40;</span>appName<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">tell</span> <span style="color: #0066ff;">application</span> <span style="color: #009900;">&quot;System Events&quot;</span> <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #000000;">&#40;</span><span style="color: #0066ff;">name</span> <span style="color: #ff0033; font-weight: bold;">of</span> processes<span style="color: #000000;">&#41;</span> <span style="color: #ff0033;">contains</span> appName<br />
<span style="color: #ff0033; font-weight: bold;">end</span> appIsRunning<br />
<span style="color: #808080; font-style: italic;">(*<br />
Name: exitApp<br />
Description: Exits the specified app if it is running.<br />
Parameters:<br />
&nbsp; &nbsp; appName - the application name<br />
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> exitApp<span style="color: #000000;">&#40;</span>appName<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">if</span> appIsRunning<span style="color: #000000;">&#40;</span>appName<span style="color: #000000;">&#41;</span> <span style="color: #ff0033; font-weight: bold;">then</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">tell</span> <span style="color: #0066ff;">application</span> appName <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">quit</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">if</span><br />
<span style="color: #ff0033; font-weight: bold;">end</span> exitApp<br />
<span style="color: #808080; font-style: italic;">(*<br />
Name: getNextFile<br />
Description: Finds the next unprocessed ScanSnap PDF<br />
Returns: the file, or &quot;&quot; if no unprocessed file is left<br />
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> getNextFile<span style="color: #000000;">&#40;</span>aFolder<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> masterFileList <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">list</span> <span style="color: #0066ff;">folder</span> aFolder ¬<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">without</span> <span style="color: #0066ff;">invisibles</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> posixPath <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">POSIX path</span> <span style="color: #ff0033; font-weight: bold;">of</span> aFolder<br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">repeat</span> <span style="color: #ff0033; font-weight: bold;">with</span> i <span style="color: #ff0033; font-weight: bold;">from</span> <span style="color: #000000;">1</span> <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">count</span> masterFileList<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> fileName <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">item</span> i <span style="color: #ff0033; font-weight: bold;">of</span> masterFileList<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> posixFilePath <span style="color: #ff0033; font-weight: bold;">to</span> posixPath <span style="color: #000000;">&amp;</span> fileName<br />
&nbsp; &nbsp; &nbsp; &nbsp; log posixFilePath<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">--</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">-- Construct a FineReader file name from our file</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">--</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> posixBaseName <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">do shell script</span> ¬<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;">&quot;filename=&quot;</span> <span style="color: #000000;">&amp;</span> <span style="color: #0066ff;">quoted form</span> <span style="color: #ff0033; font-weight: bold;">of</span> posixFilePath <span style="color: #000000;">&amp;</span> <span style="color: #009900;">&quot;; echo ${filename%<span style="color: #000000; font-weight: bold;">\\</span>.*}&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; log <span style="color: #000000;">&#40;</span><span style="color: #009900;">&quot;Name: &quot;</span> <span style="color: #000000;">&amp;</span> posixBaseName<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> posixOcrFilePath <span style="color: #ff0033; font-weight: bold;">to</span> posixBaseName <span style="color: #000000;">&amp;</span> <span style="color: #009900;">&quot; processed by FineReader.pdf&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">--</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">-- See if the FineReader file we constructed exists</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">--</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">tell</span> <span style="color: #0066ff;">application</span> <span style="color: #009900;">&quot;System Events&quot;</span> <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #ff0033; font-weight: bold;">set</span> ocrFileExists <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">exists</span> <span style="color: #0066ff;">file</span> posixOcrFilePath<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">if</span> ocrFileExists <span style="color: #ff0033; font-weight: bold;">then</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; log <span style="color: #000000;">&#40;</span><span style="color: #009900;">&quot;OCR file found for &quot;</span> <span style="color: #000000;">&amp;</span> posixBaseName<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">if</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">tell</span> <span style="color: #0066ff;">me</span> <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #ff0033; font-weight: bold;">set</span> fileCreator <span style="color: #ff0033; font-weight: bold;">to</span> getSpotlightInfo for <span style="color: #009900;">&quot;kMDItemCreator&quot;</span> <span style="color: #ff0033; font-weight: bold;">from</span> posixFilePath<br />
&nbsp; &nbsp; &nbsp; &nbsp; log <span style="color: #000000;">&#40;</span><span style="color: #009900;">&quot;Creator: &quot;</span> <span style="color: #000000;">&amp;</span> fileCreator<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">if</span> <span style="color: #ff0033;">not</span> ocrFileExists <span style="color: #ff0033;">and</span> fileCreator <span style="color: #000000;">=</span> <span style="color: #009900;">&quot;ScanSnap Manager&quot;</span> <span style="color: #ff0033; font-weight: bold;">then</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">return</span> <span style="color: #0066ff;">POSIX file</span> posixFilePath<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">if</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">repeat</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">return</span> <span style="color: #009900;">&quot;&quot;</span><br />
<span style="color: #ff0033; font-weight: bold;">end</span> getNextFile<br />
<span style="color: #808080; font-style: italic;">(*<br />
Name: getSpotlightInfo<br />
Description: Gets a named attribute from metadata for a specific file.<br />
Parameters:<br />
&nbsp; &nbsp; for myattribute - the name of the attribute<br />
&nbsp; &nbsp; from myfile - the name of the file<br />
Returns: the attribute value or &quot;&quot; if none found<br />
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> getSpotlightInfo for myattribute <span style="color: #ff0033; font-weight: bold;">from</span> myfile<br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">try</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> this_kMDItemResult <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #009900;">&quot;&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">tell</span> <span style="color: #0066ff;">application</span> <span style="color: #009900;">&quot;Finder&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> this_item <span style="color: #ff0033; font-weight: bold;">to</span> myfile <span style="color: #ff0033;">as</span> <span style="color: #0066ff;">string</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> this_item <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">POSIX path</span> <span style="color: #ff0033; font-weight: bold;">of</span> this_item<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> this_kMDItem <span style="color: #ff0033; font-weight: bold;">to</span> myattribute<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> theResult <span style="color: #ff0033; font-weight: bold;">to</span> words <span style="color: #ff0033; font-weight: bold;">of</span> <span style="color: #000000;">&#40;</span><span style="color: #0066ff;">do shell script</span> <span style="color: #009900;">&quot;/usr/bin/mdls -name &quot;</span> <span style="color: #000000;">&amp;</span> this_kMDItem <span style="color: #000000;">&amp;</span> <span style="color: #009900;">&quot; -raw -nullMarker None &quot;</span> <span style="color: #000000;">&amp;</span> <span style="color: #0066ff;">quoted form</span> <span style="color: #ff0033; font-weight: bold;">of</span> this_item<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; log <span style="color: #009900;">&quot;Result: &quot;</span> <span style="color: #000000;">&amp;</span> theResult <span style="color: #ff0033;">as</span> <span style="color: #0066ff;">string</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">repeat</span> <span style="color: #ff0033; font-weight: bold;">with</span> j <span style="color: #ff0033; font-weight: bold;">from</span> <span style="color: #000000;">1</span> <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">number</span> <span style="color: #ff0033; font-weight: bold;">of</span> <span style="color: #0066ff;">items</span> <span style="color: #ff0033; font-weight: bold;">in</span> theResult<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> this_kMDItemResult <span style="color: #ff0033; font-weight: bold;">to</span> this_kMDItemResult <span style="color: #000000;">&amp;</span> <span style="color: #0066ff;">item</span> j <span style="color: #ff0033; font-weight: bold;">of</span> theResult <span style="color: #ff0033;">as</span> <span style="color: #0066ff;">string</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">if</span> j <span style="color: #000000;">&lt;</span> <span style="color: #0066ff;">number</span> <span style="color: #ff0033; font-weight: bold;">of</span> <span style="color: #0066ff;">items</span> <span style="color: #ff0033; font-weight: bold;">in</span> theResult <span style="color: #ff0033; font-weight: bold;">then</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> this_kMDItemResult <span style="color: #ff0033; font-weight: bold;">to</span> this_kMDItemResult <span style="color: #000000;">&amp;</span> <span style="color: #009900;">&quot; &quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">if</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">repeat</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">tell</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">on</span> <span style="color: #ff0033; font-weight: bold;">error</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> this_kMDItemResult <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #009900;">&quot;&quot;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">try</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">return</span> this_kMDItemResult<br />
<span style="color: #ff0033; font-weight: bold;">end</span> getSpotlightInfo<br />
<span style="color: #808080; font-style: italic;">(*<br />
Name: waitForFineReaderFinish<br />
Description: Waits until FineReader OCR is complete.<br />
Returns: False if FineReader is not running; otherwise True once OCR is complete<br />
<br />
This procedure constantly loops through open FineReader windows looking<br />
for the window called &quot;Converting the Document&quot;<br />
Once that window goes away, the procedure exits.<br />
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> waitForFineReaderFinish<span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">if</span> <span style="color: #ff0033;">not</span> appIsRunning<span style="color: #000000;">&#40;</span><span style="color: #009900;">&quot;FineReader for ScanSnap&quot;</span><span style="color: #000000;">&#41;</span> <span style="color: #ff0033; font-weight: bold;">then</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">return</span> <span style="color: #0066ff;">false</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">if</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">tell</span> <span style="color: #0066ff;">application</span> <span style="color: #009900;">&quot;System Events&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> window_found <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">true</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">repeat</span> <span style="color: #ff0033; font-weight: bold;">until</span> <span style="color: #ff0033;">not</span> window_found<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> ew <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">name</span> <span style="color: #ff0033; font-weight: bold;">of</span> <span style="color: #ff0033;">every</span> <span style="color: #0066ff;">window</span> <span style="color: #ff0033; font-weight: bold;">of</span> <span style="color: #0066ff;">application</span> process <span style="color: #009900;">&quot;FineReader for ScanSnap&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">if</span> ew <span style="color: #ff0033;">contains</span> <span style="color: #009900;">&quot;Converting the Document&quot;</span> <span style="color: #ff0033; font-weight: bold;">then</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> window_found <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">true</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; delay <span style="color: #000000;">1</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">else</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> window_found <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">false</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">if</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">repeat</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">tell</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">return</span> <span style="color: #0066ff;">true</span><br />
<span style="color: #ff0033; font-weight: bold;">end</span> waitForFineReaderFinish<br />
<span style="color: #808080; font-style: italic;">(*<br />
Name: waitForFineReaderStart<br />
Description: Waits until FineReader OCR has begun.<br />
Returns: True if FineReader OCR has started; otherwise False<br />
<br />
This procedure is used to give FineReader a moment to actually start<br />
chewing on a file. It simply waits for the &quot;Converting the Document&quot;<br />
window to appear.<br />
In order to avoid a permanent loop if FineReader doesn't<br />
start, this times out after 30 seconds.<br />
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> waitForFineReaderStart<span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">if</span> <span style="color: #ff0033;">not</span> appIsRunning<span style="color: #000000;">&#40;</span><span style="color: #009900;">&quot;FineReader for ScanSnap&quot;</span><span style="color: #000000;">&#41;</span> <span style="color: #ff0033; font-weight: bold;">then</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">return</span> <span style="color: #0066ff;">false</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">if</span><br />
&nbsp; &nbsp; <span style="color: #808080; font-style: italic;">-- Poll once a second and give up after roughly 30 seconds</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> secondsWaited <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #000000;">0</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> window_found <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">false</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">tell</span> <span style="color: #0066ff;">application</span> <span style="color: #009900;">&quot;System Events&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">repeat</span> <span style="color: #ff0033; font-weight: bold;">until</span> window_found <span style="color: #ff0033;">or</span> secondsWaited <span style="color: #000000;">&gt;=</span> <span style="color: #000000;">30</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> ew <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">name</span> <span style="color: #ff0033; font-weight: bold;">of</span> <span style="color: #ff0033;">every</span> <span style="color: #0066ff;">window</span> <span style="color: #ff0033; font-weight: bold;">of</span> <span style="color: #0066ff;">application</span> process <span style="color: #009900;">&quot;FineReader for ScanSnap&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">if</span> ew <span style="color: #ff0033;">contains</span> <span style="color: #009900;">&quot;Converting the Document&quot;</span> <span style="color: #ff0033; font-weight: bold;">then</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> window_found <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">true</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">else</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; delay <span style="color: #000000;">1</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> secondsWaited <span style="color: #ff0033; font-weight: bold;">to</span> secondsWaited <span style="color: #000000;">+</span> <span style="color: #000000;">1</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">if</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">repeat</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">tell</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">return</span> window_found<br />
<span style="color: #ff0033; font-weight: bold;">end</span> waitForFineReaderStart</div></td></tr></tbody></table></div>
<p><strong>Installation</strong></p>
<ul>
<li>Use the Script Editor to save this script as <strong>Run OCR on New Folder Items</strong> under <strong><em>User Home</em>/Library/Scripts/Folder Action Scripts</strong>. You may have to create the <strong>Folder Action Scripts</strong> folder.</li>
<li>Now open a Finder window, control-click and choose <strong>More / Configure Folder Actions&#8230;</strong></li>
<li>Check the <strong>Enable Folder Actions</strong> checkbox, if not checked</li>
<li>Click the &#8220;+&#8221; in the bottom left</li>
<li>Select a folder and click <strong>Open</strong></li>
<li>Choose the script <strong>Run OCR on New Folder Items</strong> and click <strong>Attach</strong></li>
</ul>
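<p>If the <strong>Folder Action Scripts</strong> folder is missing, it can also be created from Terminal. This is only a minimal sketch: the function name <code>ensure_folder_action_scripts_dir</code> is made up for illustration, and the optional argument exists just so the helper can be tried against a scratch directory instead of your real home folder.</p>

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script): create the per-user
# "Folder Action Scripts" folder if it does not exist yet and print its path.
ensure_folder_action_scripts_dir() {
    # Default to the current user's home; an explicit base directory may be passed.
    home_dir=${1:-$HOME}
    target="$home_dir/Library/Scripts/Folder Action Scripts"
    # mkdir -p creates missing parent folders and is a no-op if the folder exists.
    mkdir -p "$target" && printf '%s\n' "$target"
}
```

<p>Calling <code>ensure_folder_action_scripts_dir</code> with no argument works against your own home folder. The quotes around the path matter because the folder name contains spaces.</p>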
<p><strong>Picky Details</strong></p>
<p>As you can see in the source code, there were several issues to address:</p>
<ul>
<li>I had to make sure the script didn&#8217;t step on itself. If FineReader was running, I would wait until it was ready before processing.</li>
<li>The script needed to determine which files had been processed already. This was handled fairly trivially by looking for a matching file with the <strong>processed by FineReader.pdf</strong> suffix. In other words, if I was looking at <strong>Scan001.pdf</strong>, I would see if there was a matching <strong>Scan001 processed by FineReader.pdf</strong> file.</li>
<li>Part of checking for a source file&#8217;s &#8220;buddy&#8221; was stripping off the PDF suffix. This was done in a hackish way by using a one-line shell script, at lines 106-107.</li>
<li>I thought it was important to verify that the source file was, indeed, a ScanSnap file, because FineReader will not process other PDF documents. This was done at lines 117-121 by looking at the Spotlight metadata for the Creator of the source file. That took some more shell scripting (lines 133-154).</li>
<li>The actual work was done by a single line, line 63.</li>
</ul>
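<p>The suffix-stripping and &#8220;buddy&#8221; check described above can be tried on their own in a shell. This is a rough sketch rather than an excerpt from the script: the function names <code>ocr_buddy_path</code> and <code>needs_ocr</code> and the sample path are invented for illustration. Only the <code>${filename%.*}</code> expansion and the <strong>processed by FineReader.pdf</strong> suffix come from the script itself.</p>

```shell
#!/bin/sh
# Hypothetical helper: print the name FineReader is expected to give its
# output file for a given scan.
ocr_buddy_path() {
    filename=$1
    # ${filename%.*} removes the shortest trailing ".something", i.e. the extension.
    base=${filename%.*}
    printf '%s\n' "$base processed by FineReader.pdf"
}

# Hypothetical helper: succeed only when the scan has no OCR'd buddy file yet.
needs_ocr() {
    [ ! -e "$(ocr_buddy_path "$1")" ]
}

# Sample path, made up for this example.
ocr_buddy_path "/tmp/scans/Scan001.pdf"
# prints: /tmp/scans/Scan001 processed by FineReader.pdf
```

<p>Doing the check by file name keeps the queue stateless: the folder itself records which scans are finished, so the script can be interrupted and rerun without any bookkeeping.</p>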
<p>The real work was fairly simple, while the bulk of the code was needed to polish pesky little details. Isn&#8217;t that the way code development often is?</p>
<p>If anyone has any improvements on my script, please let me know!</p>
]]></content:encoded>
			<wfw:commentRss>http://paperjammed.com/2009/08/29/automate-scansnap-ocr-process-on-your-mac-with-applescript/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
	</channel>
</rss>

