<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Paper Jammed &#187; Geeky</title>
	<atom:link href="http://paperjammed.com/tag/geeky/feed/" rel="self" type="application/rss+xml" />
	<link>http://paperjammed.com</link>
	<description>Has paper taken over your life?</description>
	<lastBuildDate>Wed, 30 Jun 2010 02:14:53 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>New life for an old PC—no geek card required</title>
		<link>http://paperjammed.com/2010/05/05/new-life-for-an-old-pc%e2%80%94no-geek-card-required/</link>
		<comments>http://paperjammed.com/2010/05/05/new-life-for-an-old-pc%e2%80%94no-geek-card-required/#comments</comments>
		<pubDate>Thu, 06 May 2010 01:52:22 +0000</pubDate>
		<dc:creator>Tad</dc:creator>
				<category><![CDATA[Paperless Life]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Backups]]></category>
		<category><![CDATA[Data Loss]]></category>
		<category><![CDATA[Geeky]]></category>
		<category><![CDATA[Good Sites]]></category>
		<category><![CDATA[Hardware]]></category>
		<category><![CDATA[Knowledge Management]]></category>
		<category><![CDATA[Networking]]></category>
		<category><![CDATA[Reviews]]></category>
		<category><![CDATA[Tips]]></category>

		<guid isPermaLink="false">http://paperjammed.com/?p=985</guid>
		<description><![CDATA[Do you still have an old machine kicking around in the basement or the back room, long forgotten?
For no cost and almost zero effort, you can set it up as a dedicated network appliance, using one of the many turnkey products from the open-source TurnKey Linux project.
I&#8217;m serious. You don&#8217;t need to know anything at [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignright size-medium wp-image-986" src="http://paperjammed.com/wp-content/uploads/2010/05/iStock_000004973496XSmall-200x300.jpg" alt="istockphoto.com" width="200" height="300" />Do you still have an old machine kicking around in the basement or the back room, long forgotten?<br />
For no cost and almost zero effort, you can set it up as a dedicated network appliance, using one of the many turnkey products from the open-source TurnKey Linux project.</p>
<p>I&#8217;m serious. You don&#8217;t need to know anything at all about Linux to use one of these. Just download the image, install it, and you suddenly have a full-featured NAS file server, a database server, or a source code repository.</p>
<p>Last year I wrote an article on <a href="http://paperjammed.com/2009/02/15/new-life-for-an-old-clunker/">how to set up a NAS device using Ubuntu Linux</a>. I have been a fan of Ubuntu since the start because it is a very easy distribution to install and configure. The downside of using Linux has always been the fairly steep learning curve. Before you can get around to using the server, you need to get down in the weeds with configuration files and other stuff.</p>
<p>TurnKey Linux changes all of that.<span id="more-985"></span></p>
<p><strong>Painless Installation</strong></p>
<p>A few weeks back, I was setting up an aging PC as a standalone wiki server for a small office—this machine was going to provide a place for the office staff to document their procedures, how-tos, and other things.</p>
<p>I was about to set up an Ubuntu server, as I have done before many times, and install MoinMoin, like I did <a href="http://paperjammed.com/2009/10/12/why-not-try-a-personal-wiki-for-some-of-your-more-amorphous-notes/">some months back</a>. I remembered that it was a bit of a pain to get everything tweaked just right, so I did a quick check to see what kind of standalone wiki options were available online.</p>
<p>This is how I found TurnKey Linux. This project is all about single-purpose preconfigured Ubuntu server images.</p>
<p>One of those preconfigured images happens to be a <a href="http://www.turnkeylinux.org/mediawiki">MediaWiki appliance</a>—the wiki engine behind Wikipedia—and I was in business.</p>
<p>The installation took about fifteen minutes, with very little user interaction. I answered a few basic questions and the installer took over from there. As soon as the install was done, the machine rebooted and displayed a message on the monitor listing the IP addresses you can browse to from any other machine.</p>
<p><strong>Full Featured</strong></p>
<p>The work that has gone into these appliances is amazing. In fifteen minutes I had installed a complex configuration that includes Apache, PHP, MySQL, and the MediaWiki core, as well as maintenance utilities such as a neat tool that provides a <span style="text-decoration: line-through;">Flash-based</span> pure-AJAX-based SSH command line in a remote browser (i.e. your browser becomes a terminal). Even someone with Linux experience would have to spend quite a bit of time fiddling around with different packages and configuration options in order to provide the same functionality that TurnKey gives you out of the box.</p>
<p>As with most open source projects, the documentation is about 80% complete, with deep detail in some areas, but leaving others fairly sparsely documented. But don&#8217;t let this deter you: in most cases users know how to use the product they are installing (e.g. MediaWiki) but don&#8217;t want the hassle of configuring it on Linux. That&#8217;s where TurnKey shines.</p>
<p><strong>Some Examples</strong></p>
<p>In minutes, you can set up a <a href="http://www.turnkeylinux.org/fileserver">NAS device</a>. If you want to try advanced content management in your office, try <a href="http://www.turnkeylinux.org/joomla">Joomla</a> or <a href="http://www.turnkeylinux.org/drupal6">Drupal</a>.</p>
<p>If you are working on a small project team and want to protect your source code, try <a href="http://www.turnkeylinux.org/redmine">Redmine</a> or <a href="http://www.turnkeylinux.org/trac">Trac</a> and do your bug tracking using <a href="http://www.turnkeylinux.org/bugzilla">Bugzilla</a>.</p>
<p>And while you are at it, you can document your organization&#8217;s working practices using a wiki such as <a href="http://www.turnkeylinux.org/moinmoin">MoinMoin</a> or <a href="http://www.turnkeylinux.org/mediawiki">MediaWiki</a>.</p>
<p><strong>Don&#8217;t forget to back it up!</strong></p>
<p>As with any computer, you should include your new TurnKey appliance in your backup strategy. The nice thing is that you don&#8217;t really need to care at all about backing up Linux or the other software; just back up the data. I don&#8217;t need to back up my entire MediaWiki machine; I just need to back up the database and image files. If anything goes wrong, you can rebuild the TurnKey appliance from scratch in minutes and then restore your data.</p>
<p>To save yourself some pain, keep notes on any small tweaks you made to the configuration.</p>
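<p>To make the &#8220;just back up the data&#8221; idea concrete, here is a minimal Python sketch of such a backup. It is not part of TurnKey: the function names, the database name, the image directory, and the backup location are all assumptions you would replace with your own, and it assumes <code>mysqldump</code> is on the path with credentials stored in <code>~/.my.cnf</code>.</p>

```python
import subprocess
import tarfile
from datetime import date
from pathlib import Path


def dump_command(database: str, out_file: Path) -> list[str]:
    """Build the mysqldump command line (credentials come from ~/.my.cnf)."""
    return ["mysqldump", "--single-transaction", f"--result-file={out_file}", database]


def archive_directory(source_dir: Path, out_file: Path) -> Path:
    """Pack source_dir into a gzip-compressed tar archive and return its path."""
    with tarfile.open(out_file, "w:gz") as archive:
        archive.add(source_dir, arcname=source_dir.name)
    return out_file


def back_up_wiki(database: str, images_dir: Path, backup_dir: Path) -> None:
    """Dump the wiki database and archive the uploaded images, stamped with today's date."""
    stamp = date.today().isoformat()
    backup_dir.mkdir(parents=True, exist_ok=True)
    sql_file = backup_dir / f"{database}-{stamp}.sql"
    subprocess.run(dump_command(database, sql_file), check=True)
    archive_directory(images_dir, backup_dir / f"images-{stamp}.tar.gz")
```

<p>A call might look like <code>back_up_wiki("wiki_db", Path("/path/to/mediawiki/images"), Path("/path/to/backups"))</code>, where all three values are placeholders. Copy the resulting files to another machine; a backup that lives only on the appliance does not protect you from a dead disk.</p>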
<p><strong>One Machine, One Purpose</strong></p>
<p>These disk images share common Ubuntu underpinnings, but they are referred to as Appliances because they turn your PC into a purpose-built appliance.</p>
<p>This means that if you want a content management system and you also want a ticket management system, you will need two old computers—not a rare commodity these days.</p>
<p>Take a look at <a href="http://www.turnkeylinux.org/">what they have to offer</a> and give TurnKey a shot—specialized software used in corporate environments is now within reach of small offices at the right price.</p>
]]></content:encoded>
			<wfw:commentRss>http://paperjammed.com/2010/05/05/new-life-for-an-old-pc%e2%80%94no-geek-card-required/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>A handful of sweet freebie tools to save the day</title>
		<link>http://paperjammed.com/2010/03/16/a-handful-of-sweet-freebie-tools-to-save-the-day/</link>
		<comments>http://paperjammed.com/2010/03/16/a-handful-of-sweet-freebie-tools-to-save-the-day/#comments</comments>
		<pubDate>Wed, 17 Mar 2010 03:31:14 +0000</pubDate>
		<dc:creator>Tad</dc:creator>
				<category><![CDATA[Searching and Indexing]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Workflow]]></category>
		<category><![CDATA[Geeky]]></category>
		<category><![CDATA[Macros]]></category>
		<category><![CDATA[PDF]]></category>
		<category><![CDATA[Scripting]]></category>
		<category><![CDATA[Tips]]></category>
		<category><![CDATA[Windows]]></category>

		<guid isPermaLink="false">http://paperjammed.com/?p=930</guid>
		<description><![CDATA[It so happens that my employer has made a most welcome decision to replace the aging creaky old Novell GroupWise mail software with Microsoft Outlook, joining the rest of the modern corporate world. Now, there is little love in my heart for GroupWise, but it does have one feature that the new Outlook configuration will [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignright size-medium wp-image-935" title="iStock_000000846660XSmall" src="http://paperjammed.com/wp-content/uploads/2010/03/iStock_000000846660XSmall-300x199.jpg" alt="" width="300" height="199" />It so happens that my employer has made a most welcome decision to replace the aging, creaky Novell GroupWise mail software with Microsoft Outlook, joining the rest of the modern corporate world. Now, there is little love in my heart for GroupWise, but it does have one feature that the new Outlook configuration will lack: you can keep as many emails as you want, just like Gmail.</p>
<p>The problem is this: with Outlook we will be limited to 1000 messages in our in-box; sadly, many of us have tens of thousands of emails in our old GroupWise mail. Even after a fairly rigorous slash-and-burn mission, hacking out all of the low-hanging fruit, there will be many thousands remaining and I don&#8217;t want to lose that information. It might be useful to search and find how I set up a Zebra bar code printer in 2003, no?</p>
<p>A bundle of different freeware glue tools came to my rescue. Read on to hear about the toolset that lets me keep those messages for years to come.<span id="more-930"></span></p>
<p><strong>Possible Solutions</strong></p>
<p>Right out of the gate, I began looking for ways to migrate messages from one mail client to the other. Some apps have this built right in, and if not, there are scripts and utilities out there to do this; but I was hampered by a few key facts:</p>
<ul>
<li>I have no control over the email clients and their configuration. Even if there is a menu option for exporting GroupWise messages from version 7.2, I&#8217;m stuck at 6.4 and cannot use that option.</li>
<li>GroupWise is a minor player in the email world. I&#8217;m not sure if Outlook would import from GroupWise, but I doubt it.</li>
<li>They are <em>replacing</em> the client in one shot. There will be no interim period where both GroupWise and Outlook will be available.</li>
<li>There is no getting around the hard limit of 1000 messages.</li>
<li>I don&#8217;t want to spend money on this.</li>
</ul>
<p>With these constraints in mind, I immediately thought about PDF documents. I then considered the following questions:</p>
<ul>
<li>How do I convert my email to PDF?</li>
<li>How can I do this automatically with thousands of emails?</li>
<li>Once I&#8217;m done, how do I search these documents?</li>
</ul>
<p>Here&#8217;s what I did:</p>
<p><strong>Conversion to PDF</strong></p>
<p>The first part was easy. I downloaded one of the many free print-to-PDF products available.</p>
<p>I chose <a href="http://sourceforge.net/projects/pdfcreator/">PDFCreator</a>, because I am familiar with its use and I know that it <a href="http://paperjammed.com/2009/10/27/dodged-the-corrupt-document-bullet-this-time-just-barely/">does not munge the fonts</a>.</p>
<p>Like many other PDF generation utilities, PDFCreator functions by providing a virtual printer to which any application can print. For example, to make a PDF of a web page, you use the Firefox <strong>Print</strong> menu and select <strong>PDFCreator</strong> from the drop-down list of available printers.</p>
<p>You are provided with a list of metadata fields that you can fill in, and these fields are used in the PDF generation.</p>
<p>Here&#8217;s what the PDFCreator screen looks like:</p>
<p><img class="alignnone size-full wp-image-931" title="20100316-pdfcreator1" src="http://paperjammed.com/wp-content/uploads/2010/03/20100316-pdfcreator1.gif" alt="" width="500" height="367" /></p>
<p><strong>A word of caution:</strong> PDFCreator is free, but you must be careful to deselect its spammy toolbar options in two different places during the installation process. I don&#8217;t like software that comes with preselected toolbars to install (even nice ones like Google&#8217;s) because I&#8217;m certain that 95% of the folks who actually install the toolbar would never have chosen to do so if it were unchecked by default.</p>
<p><strong>Running Everything Automatically</strong></p>
<p>This was the interesting bit. I use Windows machines at work, so there was no AppleScript option available. So I did the next best thing: I used <a href="http://www.autoitscript.com/autoit3/index.shtml">AutoIt</a>.</p>
<p>I will warn you that AutoIt is pretty much the Windows analog of AppleScript, without the cutesy pseudo-English syntax. In other words, you will need to roll up your sleeves and get your hands a little dirty in order to put together a decent AutoIt script.</p>
<p>The payoff comes when you finish your work and compile it into a tight executable that you can share with your friends, allowing them to automate some complex series of button clicks and copy/paste operations.</p>
<p>I walked through the manual process of exporting an email to PDF and listed each action:</p>
<ul>
<li>Get the date, sender, and subject</li>
<li>Create a filename based on date + sender + subject</li>
<li>Launch the <strong>Print</strong> dialog</li>
<li>Select <strong>PDFCreator</strong></li>
<li>Fill in the <strong>Document Title</strong>, <strong>Creation Date</strong>, and <strong>Subject</strong> in the PDFCreator dialog</li>
<li>Fill in the full file path in the Save dialog</li>
</ul>
<p>In addition, I wanted to make the script a little better by adding the following:</p>
<ul>
<li>Check that user has PDFCreator installed</li>
<li>Verify that GroupWise is running and that the user has selected one or more messages</li>
<li>Prompt the user for a target directory before processing the messages</li>
<li>Sanitize the filenames by replacing illegal characters with underscores and truncating to meet maximum filename and path length in Windows</li>
<li>Skip over files that have already been generated, quickly, so that one doesn&#8217;t need to worry about accidentally selecting messages that were already printed</li>
</ul>
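<p>The filename handling in the list above can be sketched in a few lines of Python. This is an illustration only, not the AutoIt script itself: the function names and the 200-character limit are assumptions chosen for the example, and the character set is the standard list of characters Windows refuses in filenames.</p>

```python
import re
from pathlib import Path

# Characters Windows does not allow in a filename, plus control characters.
ILLEGAL_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')

# Assumed budget for the filename itself, leaving room for the folder path
# under the classic 260-character Windows path limit.
MAX_NAME_LENGTH = 200


def build_filename(date: str, sender: str, subject: str,
                   max_length: int = MAX_NAME_LENGTH) -> str:
    """Combine date + sender + subject into a safe, length-limited PDF name."""
    stem = f"{date} {sender} {subject}"
    stem = ILLEGAL_CHARS.sub("_", stem)
    # Truncate, then drop trailing spaces and dots, which Windows also rejects.
    stem = stem[: max_length - len(".pdf")].rstrip(" .")
    return stem + ".pdf"


def needs_export(target_dir: Path, filename: str) -> bool:
    """Return True only if this message has not been printed to PDF already."""
    return not (target_dir / filename).exists()
```

<p>For example, <code>build_filename("2003-04-01", "Pat Smith", "Re: Zebra printer?")</code> gives <code>2003-04-01 Pat Smith Re_ Zebra printer_.pdf</code>. Checking <code>needs_export</code> before launching the print dialog is what makes it cheap to reselect a batch of messages that overlaps an earlier run.</p>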
<p>There were other adjustments needed, but the process was the same: run the script, hit a problem, tweak the script a little to address the problem, and repeat.</p>
<p>Here&#8217;s a little bit of the AutoIt script:</p>
<p><img class="size-full wp-image-943 alignnone" title="20100316-autoit" src="http://paperjammed.com/wp-content/uploads/2010/03/20100316-autoit.gif" alt="" width="500" height="345" /></p>
<p>You can see that it is a bit more intense than AppleScript, but remember that the full script wasn&#8217;t written in one go. I started with a short ten-line script and kept tweaking it as small problems cropped up until I had adjusted things to my liking.</p>
<p>Note that this is a GUI macro language. The machine starts clicking and typing away right in front of you and you probably shouldn&#8217;t interfere until your script finishes.</p>
<p>As of this afternoon, I have generated around 4,000 PDF documents for my email messages.</p>
<p><strong>Searching All of Those Documents</strong></p>
<p>This was the easiest part. These days there is an excellent tool available for searching documents on your desktop: <a href="http://desktop.google.com/">Google Desktop</a>. This product indexes every useful file on your desktop and provides a full Google search with a quick double-tap of the &lt;control&gt; key.</p>
<p>So you can enter a search like &#8220;Zebra bar code&#8221;</p>
<p><img class="alignnone size-full wp-image-944" title="20100316-google1" src="http://paperjammed.com/wp-content/uploads/2010/03/20100316-google1.gif" alt="" width="300" height="205" /></p>
<p>And the results look exactly like a Google web search, but it&#8217;s showing your desktop files. And you can see inline previews too.</p>
<p><img class="alignnone size-full wp-image-945" title="20100316-google2" src="http://paperjammed.com/wp-content/uploads/2010/03/20100316-google2.gif" alt="" width="500" height="443" /></p>
<p>Macintosh users can install Google Desktop as well, but all of these files should already be indexed and searchable by Spotlight.</p>
<p><strong>Closing Thoughts</strong></p>
<p>Whenever I reach for tools like this I feel a twinge of guilt—it&#8217;s outright hackery, isn&#8217;t it?</p>
<p>But there is a place for quick and dirty jobs in every workplace. I needed to get my files from one place to another, one time only. It just didn&#8217;t make sense to spend money or time on a more elegant solution.</p>
<p>Play around with each of these tools a little, especially AutoIt: it&#8217;s a handy Swiss Army knife to have at your disposal.</p>
]]></content:encoded>
			<wfw:commentRss>http://paperjammed.com/2010/03/16/a-handful-of-sweet-freebie-tools-to-save-the-day/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Automate ScanSnap OCR process on your Mac with AppleScript (Snow Leopard Edition)</title>
		<link>http://paperjammed.com/2010/01/04/automate-scansnap-ocr-process-on-your-mac-with-applescript-snow-leopard-edition/</link>
		<comments>http://paperjammed.com/2010/01/04/automate-scansnap-ocr-process-on-your-mac-with-applescript-snow-leopard-edition/#comments</comments>
		<pubDate>Tue, 05 Jan 2010 01:51:52 +0000</pubDate>
		<dc:creator>Tad</dc:creator>
				<category><![CDATA[Software]]></category>
		<category><![CDATA[Workflow]]></category>
		<category><![CDATA[Geeky]]></category>
		<category><![CDATA[PDF]]></category>
		<category><![CDATA[Scanning]]></category>
		<category><![CDATA[Scripting]]></category>
		<category><![CDATA[Searching and Indexing]]></category>

		<guid isPermaLink="false">http://paperjammed.com/?p=840</guid>
		<description><![CDATA[Some time back I published an AppleScript that allows one to automatically run OCR in the background on scanned files generated by your Fujitsu ScanSnap, while you continue scanning more files. ScanSnap owners should all be familiar with this: the out-of-the-box configuration of the ScanSnap Manager and Abbyy Finereader forces the scan and OCR [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://paperjammed.com/wp-content/uploads/2009/08/20090829-applescript.gif"><img class="alignright size-full wp-image-658" title="20090829-applescript" src="http://paperjammed.com/wp-content/uploads/2009/08/20090829-applescript.gif" alt="" width="128" height="128" /></a>Some time back I published an AppleScript that allows one to <a href="http://paperjammed.com/2009/08/29/automate-scansnap-ocr-process-on-your-mac-with-applescript/">automatically run OCR in the background on scanned files</a> generated by your Fujitsu ScanSnap, while you continue scanning more files. ScanSnap owners should all be familiar with this: the out-of-the-box configuration of the ScanSnap Manager and Abbyy Finereader forces the scan and OCR stages to run in lockstep: scan 1&#8230;OCR 1&#8230;scan 2&#8230;OCR 2&#8230; and so on. This script allowed you to scan regardless of the OCR processing going on.</p>
<p>As it turns out, my original script does not work in Snow Leopard, and I promised that I would one day clean up and publish my new and improved version.</p>
<p>Chris posted a comment today as a gentle reminder, so here is the new and improved version without further delay&#8230;<br />
<span id="more-840"></span><br />
<strong>The Details</strong></p>
<p>Unfortunately, Snow Leopard came around <a href="http://paperjammed.com/2009/09/07/when-migrating-to-a-new-operating-system-look-before-you-leap/">and caused some indigestion</a>. For starters, the ScanSnap Manager didn&#8217;t work correctly and Abbyy Finereader would not process anything made by the ScanSnap. A couple of months later <a href="http://paperjammed.com/2009/11/13/snow-leopard-update-for-scansnap/">they got everything straightened out</a> and delivered <a href="http://www.fujitsu.com/us/services/computing/peripherals/scanners/support/sl_download.html">new versions of each product</a>.</p>
<p>The new version of the Abbyy Finereader product does not play well with my original script.</p>
<p>Since I cannot do without this important functionality, I rolled up my sleeves and rewrote most of the script. The new version works quite nicely in Snow Leopard, with one small annoyance: you really don&#8217;t want to use the machine for anything other than scanning or OCR while it is running, because the new Finereader version keeps bouncing the darned icon the whole time, and that is quite annoying to watch.</p>
<p>Fortunately, I really don&#8217;t need to use my machine for anything else while it is chewing on the docs; I just wanted to be able to continue scanning at the same time!</p>
<p><strong>Note: </strong>Before going forward, note that you will need to upgrade the ScanSnap Manager and Abbyy Finereader to the Snow Leopard versions first! Get the files <a href="http://www.fujitsu.com/us/services/computing/peripherals/scanners/support/sl_download.html">here</a>.</p>
<p>Here is a link to the <a href="http://paperjammed.com/wp-content/uploads/2010/01/Run-OCR-on-New-Folder-Items.scpt">new script</a>&#8230;</p>
<p>And here&#8217;s the code itself:</p>
<div class="codecolorer-container applescript default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:435px;height:300px;"><table cellspacing="0" cellpadding="0"><tbody><tr><td><div class="applescript codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #808080; font-style: italic;">(*<br />
<br />
NOTE: This script was written for Snow Leopard. It may work<br />
on Leopard, but I never tried it.<br />
<br />
This is a folder listener script that will act as a queue, receiving<br />
PDF files from the ScanSnap scanner and feeding them, one by one, to<br />
the Abbyy FineReader OCR software.<br />
<br />
This allows you to keep scanning while the OCR job runs in the background<br />
on all of the unprocessed files.<br />
<br />
Why do we want to do this?<br />
<br />
The ScanSnap Manager software does not support this by default, so<br />
when you scan in a file, it sends it to FineReader for OCR. You then<br />
must wait until FineReader finishes its work before scanning in another<br />
document.<br />
<br />
This script allows you to keep scanning without waiting for OCR.<br />
<br />
Installation:<br />
<br />
o &nbsp; Copy this script to:<br />
<br />
&nbsp; &nbsp; &lt;home&gt;/Library/Scripts/Folder Action Scripts<br />
<br />
&nbsp; &nbsp; You may have to create the &quot;Folder Action Scripts&quot; folder.<br />
<br />
o &nbsp; Open a Finder window and navigate to the parent folder<br />
&nbsp; of the scanned documents folder.<br />
<br />
o Right click (control-click) the scanned documents folder and<br />
&nbsp; choose:<br />
<br />
&nbsp; &nbsp; Folder Actions Setup...<br />
<br />
o At this point if folder actions are not enabled, you will<br />
&nbsp; likely have to enable them and add the script manually.<br />
&nbsp; &nbsp; - check &quot;Enable Folder Actions&quot;<br />
&nbsp; &nbsp; - Use the &quot;+&quot; buttons on the left and right sides to add the<br />
&nbsp; &nbsp; &nbsp; scan folder and then this script.<br />
&nbsp; &nbsp; <br />
o Otherwise, a list of scripts will come up. Choose this script<br />
&nbsp; from the &quot;Choose a Script to Attach&quot; dialog.<br />
<br />
o Close all windows.<br />
<br />
Copyright (C) 2010 Tad Harrison<br />
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">property</span> ocrFileSuffix : <span style="color: #009900;">&quot; processed by FineReader.pdf&quot;</span><br />
<span style="color: #ff0033; font-weight: bold;">property</span> ocrApplicationName : <span style="color: #009900;">&quot;Scan to Searchable PDF&quot;</span><br />
<span style="color: #ff0033; font-weight: bold;">property</span> ocrApplicationWindow : <span style="color: #009900;">&quot;Converting the document&quot;</span><br />
<span style="color: #ff0033; font-weight: bold;">property</span> ocrLockFileName : <span style="color: #009900;">&quot;OCR in Progress&quot;</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> <span style="color: #0066ff;">adding</span> <span style="color: #0066ff;">folder</span> <span style="color: #0066ff;">items</span> <span style="color: #ff0033; font-weight: bold;">to</span> this_folder <span style="color: #ff0033;">after</span> <span style="color: #0066ff;">receiving</span> added_items<br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> lockFilePath <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #000000;">&#40;</span><span style="color: #0066ff;">POSIX path</span> <span style="color: #ff0033; font-weight: bold;">of</span> <span style="color: #000000;">&#40;</span><span style="color: #0066ff;">path to</span> <span style="color: #0066ff;">desktop</span> <span style="color: #0066ff;">folder</span> <span style="color: #ff0033;">as</span> <span style="color: #0066ff;">text</span><span style="color: #000000;">&#41;</span><span style="color: #000000;">&#41;</span> <span style="color: #000000;">&amp;</span> ocrLockFileName<br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">try</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; logEvent<span style="color: #000000;">&#40;</span><span style="color: #009900;">&quot;=== Run OCR on New Folder Items ===&quot;</span><span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">-- Test for lockfile; exit if lockfile exists</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">tell</span> <span style="color: #0066ff;">application</span> <span style="color: #009900;">&quot;System Events&quot;</span> <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #ff0033; font-weight: bold;">set</span> lockFileExists <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">exists</span> <span style="color: #0066ff;">file</span> lockFilePath<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">if</span> lockFileExists <span style="color: #ff0033; font-weight: bold;">then</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; logEvent<span style="color: #000000;">&#40;</span><span style="color: #009900;">&quot;Other script running. Exiting...&quot;</span><span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">return</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">else</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #0066ff;">do shell script</span> <span style="color: #009900;">&quot;/usr/bin/touch <span style="color: #000000; font-weight: bold;">\&quot;</span>&quot;</span> <span style="color: #000000;">&amp;</span> lockFilePath <span style="color: #000000;">&amp;</span> <span style="color: #009900;">&quot;<span style="color: #000000; font-weight: bold;">\&quot;</span>&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">if</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">-- Main loop</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> moreWorkToDo <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">true</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">repeat</span> <span style="color: #ff0033; font-weight: bold;">while</span> moreWorkToDo<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> aFile <span style="color: #ff0033; font-weight: bold;">to</span> getNextFile<span style="color: #000000;">&#40;</span>this_folder<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">if</span> <span style="color: #ff0033;">not</span> aFile <span style="color: #000000;">=</span> <span style="color: #009900;">&quot;&quot;</span> <span style="color: #ff0033; font-weight: bold;">then</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; ocrFile<span style="color: #000000;">&#40;</span>aFile<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">else</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> moreWorkToDo <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">false</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">if</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">repeat</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; logEvent<span style="color: #000000;">&#40;</span><span style="color: #009900;">&quot;No more work.&quot;</span><span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; exitApp<span style="color: #000000;">&#40;</span>ocrApplicationName<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">on</span> <span style="color: #ff0033; font-weight: bold;">error</span> errorStr <span style="color: #0066ff;">number</span> errNum<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #0066ff;">display dialog</span> <span style="color: #009900;">&quot;Error &quot;</span> <span style="color: #000000;">&amp;</span> errNum <span style="color: #000000;">&amp;</span> <span style="color: #009900;">&quot; while running OCR: &quot;</span> <span style="color: #000000;">&amp;</span> errorStr<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> <span style="color: #ff0033; font-weight: bold;">my</span> isRunning <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">false</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">try</span><br />
&nbsp; &nbsp; <span style="color: #808080; font-style: italic;">-- Get rid of the lockfile, ignoring any errors</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">try</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #0066ff;">do shell script</span> <span style="color: #009900;">&quot;/bin/rm <span style="color: #000000; font-weight: bold;">\&quot;</span>&quot;</span> <span style="color: #000000;">&amp;</span> lockFilePath <span style="color: #000000;">&amp;</span> <span style="color: #009900;">&quot;<span style="color: #000000; font-weight: bold;">\&quot;</span>&quot;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">try</span><br />
<span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #0066ff;">adding</span> <span style="color: #0066ff;">folder</span> <span style="color: #0066ff;">items</span> <span style="color: #ff0033; font-weight: bold;">to</span><br />
<span style="color: #808080; font-style: italic;">(*<br />
Name: ocrFile<br />
Description: Runs OCR on the given file and waits for the OCR output file to appear<br />
Parameters:<br />
&nbsp; aFile - the file to be OCR'd<br />
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> ocrFile<span style="color: #000000;">&#40;</span>aFile<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> posixFilePath <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">POSIX path</span> <span style="color: #ff0033; font-weight: bold;">of</span> aFile<br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> posixOcrFilePath <span style="color: #ff0033; font-weight: bold;">to</span> getPosixOcrFilePath<span style="color: #000000;">&#40;</span>posixFilePath<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; logEvent<span style="color: #000000;">&#40;</span><span style="color: #009900;">&quot;OCR: &quot;</span> <span style="color: #000000;">&amp;</span> posixFilePath<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">tell</span> <span style="color: #0066ff;">application</span> ocrApplicationName <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">open</span> aFile<br />
&nbsp; &nbsp; <span style="color: #808080; font-style: italic;">--</span><br />
&nbsp; &nbsp; <span style="color: #808080; font-style: italic;">-- Now sit in a loop checking once per second for the OCR file</span><br />
&nbsp; &nbsp; <span style="color: #808080; font-style: italic;">-- Note: the timeout below applies to each Apple event, not to the loop as a whole</span><br />
&nbsp; &nbsp; <span style="color: #808080; font-style: italic;">--</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">with</span> <span style="color: #ff0033; font-weight: bold;">timeout</span> <span style="color: #ff0033; font-weight: bold;">of</span> <span style="color: #000000;">300</span> seconds<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> ocrFileExists <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">false</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">repeat</span> <span style="color: #ff0033; font-weight: bold;">until</span> ocrFileExists<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> ocrFileExists <span style="color: #ff0033; font-weight: bold;">to</span> posixFileExists<span style="color: #000000;">&#40;</span>posixOcrFilePath<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">if</span> ocrFileExists <span style="color: #ff0033; font-weight: bold;">then</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; logEvent<span style="color: #000000;">&#40;</span><span style="color: #009900;">&quot;OCR file generated.&quot;</span><span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">-- Wait 5 even if the file was found, to let things settle</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; delay <span style="color: #000000;">5</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">else</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">-- Wait a second before checking again</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; delay <span style="color: #000000;">1</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">if</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">repeat</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">timeout</span><br />
<span style="color: #ff0033; font-weight: bold;">end</span> ocrFile<br />
<span style="color: #808080; font-style: italic;">(*<br />
Name: appIsRunning<br />
Description: Determines if a particular application is running.<br />
Parameters:<br />
&nbsp; &nbsp; appName - the name of the application to be tested<br />
Returns: True if the application is running; otherwise False<br />
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> appIsRunning<span style="color: #000000;">&#40;</span>appName<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">tell</span> <span style="color: #0066ff;">application</span> <span style="color: #009900;">&quot;System Events&quot;</span> <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #000000;">&#40;</span><span style="color: #0066ff;">name</span> <span style="color: #ff0033; font-weight: bold;">of</span> processes<span style="color: #000000;">&#41;</span> <span style="color: #ff0033;">contains</span> appName<br />
<span style="color: #ff0033; font-weight: bold;">end</span> appIsRunning<br />
<span style="color: #808080; font-style: italic;">(*<br />
Name: posixFileExists<br />
Description: Determines if a particular file exists.<br />
Parameters:<br />
&nbsp; &nbsp; posixFilePath - the POSIX path to the file<br />
Returns: True if the file exists; otherwise False<br />
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> posixFileExists<span style="color: #000000;">&#40;</span>posixFilePath<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">tell</span> <span style="color: #0066ff;">application</span> <span style="color: #009900;">&quot;System Events&quot;</span> <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">exists</span> <span style="color: #0066ff;">file</span> posixFilePath<br />
<span style="color: #ff0033; font-weight: bold;">end</span> posixFileExists<br />
<span style="color: #808080; font-style: italic;">(*<br />
Name: exitApp<br />
Description: Exits the specified app if it is running.<br />
Parameters:<br />
&nbsp; &nbsp; appName - the application name<br />
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> exitApp<span style="color: #000000;">&#40;</span>appName<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">if</span> appIsRunning<span style="color: #000000;">&#40;</span>appName<span style="color: #000000;">&#41;</span> <span style="color: #ff0033; font-weight: bold;">then</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">tell</span> <span style="color: #0066ff;">application</span> appName <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">quit</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">if</span><br />
<span style="color: #ff0033; font-weight: bold;">end</span> exitApp<br />
<span style="color: #808080; font-style: italic;">(*<br />
Name: getPosixOcrFilePath<br />
Description: Gets the OCR output filename for a given input filename.<br />
Parameters:<br />
&nbsp; &nbsp; posixFilePath - the full path to the source file<br />
Return: the POSIX path of the OCR output file<br />
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> getPosixOcrFilePath<span style="color: #000000;">&#40;</span>posixFilePath<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> posixBaseName <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">do shell script</span> ¬<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;">&quot;filename=&quot;</span> <span style="color: #000000;">&amp;</span> <span style="color: #0066ff;">quoted form</span> <span style="color: #ff0033; font-weight: bold;">of</span> posixFilePath <span style="color: #000000;">&amp;</span> <span style="color: #009900;">&quot;; echo ${filename%<span style="color: #000000; font-weight: bold;">\\</span>.*}&quot;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> posixOcrFilePath <span style="color: #ff0033; font-weight: bold;">to</span> posixBaseName <span style="color: #000000;">&amp;</span> ocrFileSuffix<br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">return</span> posixOcrFilePath<br />
<span style="color: #ff0033; font-weight: bold;">end</span> getPosixOcrFilePath<br />
<span style="color: #808080; font-style: italic;">(*<br />
Name: getNextFile<br />
Description: Finds the next unprocessed ScanSnap PDF<br />
Return: the file or &quot;&quot;<br />
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> getNextFile<span style="color: #000000;">&#40;</span>aFolder<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; logEvent<span style="color: #000000;">&#40;</span><span style="color: #009900;">&quot;Getting next file...&quot;</span><span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> masterFileList <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">list</span> <span style="color: #0066ff;">folder</span> aFolder ¬<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">without</span> <span style="color: #0066ff;">invisibles</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> posixPath <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">POSIX path</span> <span style="color: #ff0033; font-weight: bold;">of</span> aFolder<br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">repeat</span> <span style="color: #ff0033; font-weight: bold;">with</span> i <span style="color: #ff0033; font-weight: bold;">from</span> <span style="color: #000000;">1</span> <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">count</span> masterFileList<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> fileName <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">item</span> i <span style="color: #ff0033; font-weight: bold;">of</span> masterFileList<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> posixFilePath <span style="color: #ff0033; font-weight: bold;">to</span> posixPath <span style="color: #000000;">&amp;</span> fileName<br />
&nbsp; &nbsp; &nbsp; &nbsp; log posixFilePath<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">--</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">-- Construct a FineReader file name from our file</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">--</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> posixOcrFilePath <span style="color: #ff0033; font-weight: bold;">to</span> getPosixOcrFilePath<span style="color: #000000;">&#40;</span>posixFilePath<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">--</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">-- See if the FineReader file we constructed exists</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">--</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> ocrFileExists <span style="color: #ff0033; font-weight: bold;">to</span> posixFileExists<span style="color: #000000;">&#40;</span>posixOcrFilePath<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">tell</span> <span style="color: #0066ff;">me</span> <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #ff0033; font-weight: bold;">set</span> fileCreator <span style="color: #ff0033; font-weight: bold;">to</span> getSpotlightInfo for <span style="color: #009900;">&quot;kMDItemCreator&quot;</span> <span style="color: #ff0033; font-weight: bold;">from</span> posixFilePath<br />
&nbsp; &nbsp; &nbsp; &nbsp; log <span style="color: #000000;">&#40;</span><span style="color: #009900;">&quot;Creator: &quot;</span> <span style="color: #000000;">&amp;</span> fileCreator<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">if</span> <span style="color: #ff0033;">not</span> ocrFileExists <span style="color: #ff0033;">and</span> fileCreator <span style="color: #000000;">=</span> <span style="color: #009900;">&quot;ScanSnap Manager&quot;</span> <span style="color: #ff0033; font-weight: bold;">then</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">return</span> <span style="color: #0066ff;">POSIX file</span> posixFilePath<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">if</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">repeat</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">return</span> <span style="color: #009900;">&quot;&quot;</span><br />
<span style="color: #ff0033; font-weight: bold;">end</span> getNextFile<br />
<span style="color: #808080; font-style: italic;">(*<br />
Name: getSpotlightInfo<br />
Description: Gets a named attribute from metadata for a specific file.<br />
Parameters:<br />
&nbsp; &nbsp; for myattribute - the name of the attribute<br />
&nbsp; &nbsp; from myfile - the name of the file<br />
Returns: the attribute value or &quot;&quot; if none found<br />
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> getSpotlightInfo for myattribute <span style="color: #ff0033; font-weight: bold;">from</span> myfile<br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">try</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> this_kMDItemResult <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #009900;">&quot;&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">tell</span> <span style="color: #0066ff;">application</span> <span style="color: #009900;">&quot;Finder&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> this_item <span style="color: #ff0033; font-weight: bold;">to</span> myfile <span style="color: #ff0033;">as</span> <span style="color: #0066ff;">string</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> this_item <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">POSIX path</span> <span style="color: #ff0033; font-weight: bold;">of</span> this_item<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> this_kMDItem <span style="color: #ff0033; font-weight: bold;">to</span> myattribute<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> theResult <span style="color: #ff0033; font-weight: bold;">to</span> words <span style="color: #ff0033; font-weight: bold;">of</span> <span style="color: #000000;">&#40;</span><span style="color: #0066ff;">do shell script</span> <span style="color: #009900;">&quot;/usr/bin/mdls -name &quot;</span> <span style="color: #000000;">&amp;</span> this_kMDItem <span style="color: #000000;">&amp;</span> <span style="color: #009900;">&quot; -raw -nullMarker None &quot;</span> <span style="color: #000000;">&amp;</span> <span style="color: #0066ff;">quoted form</span> <span style="color: #ff0033; font-weight: bold;">of</span> this_item<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; log <span style="color: #009900;">&quot;Result: &quot;</span> <span style="color: #000000;">&amp;</span> theResult <span style="color: #ff0033;">as</span> <span style="color: #0066ff;">string</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">repeat</span> <span style="color: #ff0033; font-weight: bold;">with</span> j <span style="color: #ff0033; font-weight: bold;">from</span> <span style="color: #000000;">1</span> <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">number</span> <span style="color: #ff0033; font-weight: bold;">of</span> <span style="color: #0066ff;">items</span> <span style="color: #ff0033; font-weight: bold;">in</span> theResult<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> this_kMDItemResult <span style="color: #ff0033; font-weight: bold;">to</span> this_kMDItemResult <span style="color: #000000;">&amp;</span> <span style="color: #0066ff;">item</span> j <span style="color: #ff0033; font-weight: bold;">of</span> theResult <span style="color: #ff0033;">as</span> <span style="color: #0066ff;">string</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">if</span> j <span style="color: #000000;">&lt;</span> <span style="color: #0066ff;">number</span> <span style="color: #ff0033; font-weight: bold;">of</span> <span style="color: #0066ff;">items</span> <span style="color: #ff0033; font-weight: bold;">in</span> theResult <span style="color: #ff0033; font-weight: bold;">then</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> this_kMDItemResult <span style="color: #ff0033; font-weight: bold;">to</span> this_kMDItemResult <span style="color: #000000;">&amp;</span> <span style="color: #009900;">&quot; &quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">if</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">repeat</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">tell</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">on</span> <span style="color: #ff0033; font-weight: bold;">error</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> this_kMDItemResult <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #009900;">&quot;&quot;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">try</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">return</span> this_kMDItemResult<br />
<span style="color: #ff0033; font-weight: bold;">end</span> getSpotlightInfo<br />
<span style="color: #808080; font-style: italic;">(*<br />
Name: logEvent<br />
Description: Write an event to an event log<br />
Parameters:<br />
&nbsp; &nbsp; themessage - the message to write to the log<br />
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> logEvent<span style="color: #000000;">&#40;</span>themessage<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> theLine <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #000000;">&#40;</span><span style="color: #0066ff;">do shell script</span> ¬<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;">&quot;date +'%Y-%m-%d %H:%M:%S'&quot;</span> <span style="color: #ff0033;">as</span> <span style="color: #0066ff;">string</span><span style="color: #000000;">&#41;</span> ¬<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #000000;">&amp;</span> <span style="color: #009900;">&quot; &quot;</span> <span style="color: #000000;">&amp;</span> themessage<br />
&nbsp; &nbsp; <span style="color: #0066ff;">do shell script</span> <span style="color: #009900;">&quot;echo &quot;</span> <span style="color: #000000;">&amp;</span> <span style="color: #0066ff;">quoted form</span> <span style="color: #ff0033; font-weight: bold;">of</span> theLine <span style="color: #000000;">&amp;</span> ¬<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;">&quot; &gt;&gt; ~/Library/Logs/AppleScript-events.log&quot;</span><br />
<span style="color: #ff0033; font-weight: bold;">end</span> logEvent</div></td></tr></tbody></table></div>
<p><strong>Installation</strong></p>
<ul>
<li>Use the Script Editor to save this script as <strong>Run OCR on New Folder Items</strong> under <strong><em>User Home</em>/Library/Scripts/Folder Action Scripts</strong><br />
You may have to create the <strong>Folder Action Scripts</strong> folder.</li>
<li>Now open a Finder window and navigate to the parent folder of your scanned documents folder.</li>
<li>Right click (control-click) the scanned documents folder and choose <strong>Folder Actions Setup&#8230;</strong></li>
<li>At this point if folder actions are not enabled, you will likely have to enable them and add the script manually.
<ul>
<li> Check <strong>Enable Folder Actions</strong></li>
<li>Use the &#8220;+&#8221; buttons on the left and right sides to add the scan folder and then this script.</li>
</ul>
</li>
<li>Otherwise, a list of scripts will come up. Choose this script from the <strong>Choose a Script to Attach</strong> dialog.</li>
<li>Close all windows.</li>
</ul>
<p>That&#8217;s it! The script will be invoked automatically every time a new file appears in your scanned documents folder.</p>
<p>Please let me know if you have any ideas that can improve this script. I&#8217;m not an AppleScript guru, so someone might just know how to keep that annoying FineReader icon from jumping.</p>
]]></content:encoded>
			<wfw:commentRss>http://paperjammed.com/2010/01/04/automate-scansnap-ocr-process-on-your-mac-with-applescript-snow-leopard-edition/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>Keeping your secrets to yourself—old changes lingering in your PDF files</title>
		<link>http://paperjammed.com/2009/11/23/keeping-your-secrets-to-yourself-old-changes-lingering-in-your-pdf-files/</link>
		<comments>http://paperjammed.com/2009/11/23/keeping-your-secrets-to-yourself-old-changes-lingering-in-your-pdf-files/#comments</comments>
		<pubDate>Tue, 24 Nov 2009 04:46:58 +0000</pubDate>
		<dc:creator>Tad</dc:creator>
				<category><![CDATA[Security]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Data Loss]]></category>
		<category><![CDATA[Geeky]]></category>
		<category><![CDATA[PDF]]></category>

		<guid isPermaLink="false">http://paperjammed.com/?p=781</guid>
		<description><![CDATA[A few months ago I wrote an article that touched upon the problems inherent in attempts to sanitize documents before sending them to the enemy, perhaps to remove competitors&#8217; names or trade secrets.
I was reading a post on a board I frequent where a person was describing exactly this kind of activity—removing sensitive information from PDF [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignright size-medium wp-image-791" title="Rusty trap" src="http://paperjammed.com/wp-content/uploads/2009/11/iStock_000011076402XSmall-300x225.jpg" alt="Rusty trap" width="300" height="225" />A few months ago I wrote an article that touched upon <a href="http://paperjammed.com/2009/04/21/keeping-your-secrets-to-yourself—what-can-your-shared-documents-tell-others/">the problems inherent in attempts to sanitize documents</a> before sending them to the enemy—perhaps to remove competitor&#8217;s names or trade secrets.</p>
<p>I was reading a post on a board I frequent where a person was describing exactly this kind of activity—removing sensitive information from PDF documents. Several suggestions were made, but one individual suggested opening the file in Acrobat Pro and replacing the sensitive text with good old <a href="http://www.lipsum.com/">Lorem Ipsum</a>.</p>
<p>It was at that moment that I recalled a peculiar feature of the PDF file format: it is designed to support nondestructive updates, allowing people to make vast changes to a PDF document while still retaining the original document, fully intact. I did a few experiments and was surprised by the results.<span id="more-781"></span></p>
<p><strong>A Brief Note on the PDF File Format</strong></p>
<p>For the geeky types among us, one place to begin is this article:</p>
<p><a href="http://www.mactech.com/articles/mactech/Vol.15/15.09/PDFIntro/">Portable Document Format: An Introduction for Programmers</a></p>
<p>The key point to take from the article is that a PDF document is composed of several distinct sections: a <strong>Header</strong>, a <strong>Body</strong>, an <strong>&#8220;xref&#8221; Table</strong>, and a <strong>Trailer</strong>. At the very end of the file you will find the character sequence <strong>%%EOF</strong>.</p>
<p>The PDF standard was designed to allow multiple updates to a document, while retaining the original version. This is accomplished by appending anything new to the end of the document, after the original <strong>EOF</strong> tag. The document will now have two <strong>EOF</strong> tags: one indicating where the original document ended, and a new <strong>EOF</strong> tag indicating where the new changes end.</p>
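<p>As a rough illustration of the append-only behavior described above, the Python sketch below counts <strong>%%EOF</strong> markers in a byte string. The sample bytes are a made-up stand-in for a real PDF, and <code>eof_offsets</code> is a hypothetical helper name, not part of any PDF library. Treat a count above one as a hint that earlier revisions may be present rather than as proof, since the same bytes can turn up elsewhere in a file.</p>

```python
from typing import List

EOF_MARKER = b"%%EOF"


def eof_offsets(data: bytes) -> List[int]:
    """Return the byte offset of every %%EOF marker found in data."""
    offsets = []
    pos = data.find(EOF_MARKER)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(EOF_MARKER, pos + len(EOF_MARKER))
    return offsets


# Made-up stand-in for a PDF: one original revision plus one appended update.
original = b"%PDF-1.3\n(original body)\nxref\ntrailer\nstartxref\n25\n%%EOF\n"
update = b"(changed objects)\nxref\ntrailer\nstartxref\n80\n%%EOF\n"
updated = original + update

print(len(eof_offsets(original)), "marker(s) in the original")
print(len(eof_offsets(updated)), "marker(s) after the appended update")
```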
<p>If we wish to revert PDF changes, it should be a simple matter of opening the PDF file in a binary editor, searching for the first <strong>EOF</strong> tag, and deleting everything that follows it.</p>
<p><strong>A Simple Experiment</strong></p>
<p>Let&#8217;s start with a proper secret document containing missile plans&#8230;</p>
<p><img class="alignnone size-full wp-image-785" title="20091123-missile-plans-1" src="http://paperjammed.com/wp-content/uploads/2009/11/20091123-missile-plans-1.gif" alt="20091123-missile-plans-1" width="439" height="418" /></p>
<p>Suppose we want to obscure some special information in paragraph 37. We can open the file in Acrobat Professional and use its text editing features to swap in the venerable <em>Lorem Ipsum</em> text.</p>
<p>Here&#8217;s what it looks like after the switch:</p>
<p><img class="alignnone size-full wp-image-786" title="20091123-lorem-ipsum" src="http://paperjammed.com/wp-content/uploads/2009/11/20091123-lorem-ipsum.gif" alt="20091123-lorem-ipsum" width="598" height="243" /></p>
<p>You can see here that the first seven lines of text starting at paragraph 37 have been replaced with appropriately unreadable text.</p>
<p>Now, open the new PDF file in a binary editor (PDF files contain a mix of text and binary data, so a plain text editor will not do).</p>
<p><img class="alignnone size-full wp-image-787" title="20091123-binary-editor" src="http://paperjammed.com/wp-content/uploads/2009/11/20091123-binary-editor.gif" alt="20091123-binary-editor" width="693" height="633" /></p>
<p>Note the <strong>%%EOF</strong> character sequence embedded in the text. This is the first <strong>EOF</strong> tag, indicating where the original file ended. All we need to do is place the cursor to the right of the <strong>EOF</strong> and delete everything to the end of the file.</p>
<p>Once we have done so, it&#8217;s like magic:</p>
<p><img class="alignnone size-full wp-image-788" title="20091123-after-binary-editing" src="http://paperjammed.com/wp-content/uploads/2009/11/20091123-after-binary-editing.gif" alt="20091123-after-binary-editing" width="794" height="323" /></p>
<p>The edits that replaced lines of paragraph 37 with gibberish have neatly been undone!</p>
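<p>The binary-editor step above can also be sketched in a few lines of Python. This is only a sketch under simple assumptions: the first <strong>%%EOF</strong> really does mark the end of the original revision, and the sample bytes are invented stand-ins rather than a real PDF. The function name <code>strip_appended_updates</code> is hypothetical.</p>

```python
EOF_MARKER = b"%%EOF"


def strip_appended_updates(data: bytes) -> bytes:
    """Keep everything up to and including the first %%EOF marker."""
    pos = data.find(EOF_MARKER)
    if pos == -1:
        raise ValueError("no %%EOF marker found; is this a PDF?")
    # Re-add a trailing newline after the marker for tidiness.
    return data[: pos + len(EOF_MARKER)] + b"\n"


# Invented stand-ins: an original revision and an edited copy with an update appended.
original = b"%PDF-1.3\n(secret paragraph 37)\n%%EOF\n"
edited = original + b"(lorem ipsum replacement)\n%%EOF\n"

recovered = strip_appended_updates(edited)
print(recovered == original)
```

<p>To try this on a real file, read it with <code>pathlib.Path.read_bytes()</code> and write the result to a new file rather than overwriting the one you are examining.</p>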
<p><strong>More Details</strong></p>
<p>From the <a href="http://www.mactech.com/articles/mactech/Vol.15/15.09/PDFIntro/">PDF Intro document</a> linked earlier:</p>
<p>&#8220;The trailer, it turns out, plays an important role in the way PDF implements incremental updating. The key concept to understand here is that a PDF file is never overwritten, only added to. That goes for all portions of the PDF file &#8211; even the trailer itself, and the end-of-file marker. In other words, a multiply-updated PDF document may contain multiple trailers &#8211; and multiple end-of-file markers! (There may be numerous occurrences of %%EOF.) Each time the file is edited, an addendum is written to the tail of the file, consisting of the content objects that have changed, a new xref section, and a new trailer containing all the information that was in the previous trailer, as well as a /Prev key specifying the byte offset (from the beginning of the file) of the previous xref section. The cross-reference info will then be distributed across more than one xref section. To access all of the cross-references, the reader must walk the list of /Prev keys in all the trailers, in reverse order.</p>
<p>Space doesn&#8217;t permit a detailed exploration of updates here, but you can find several examples in Appendix A of the PDF 1.3 specification (available at <a href="http://partners.adobe.com/asn/developer">http://partners.adobe.com/asn/developer</a>).&#8221;</p>
<p><strong>Summary</strong></p>
<p>It is important to understand that the PDF standard allows for appended updates to files that leave the original document intact, regardless of how drastic the changes are. If you are intent on redacting text from PDF documents, do not depend on simply deleting the secrets using a PDF editor—you must use a proper redaction tool that addresses these issues correctly.</p>
<p>That said, I did some experimenting with a few utilities (Apple Preview, PDFpen, and Adobe Acrobat Pro) and found that some write the file from scratch each time, with no lingering cruft from former versions, while others respect the original intent of the PDF standard. This means that you can&#8217;t trust that older revisions are being retained in your file and you can&#8217;t trust that they aren&#8217;t.</p>
<p>Be conservative: use a redaction tool for secrecy and proper backups for versioning.</p>
]]></content:encoded>
			<wfw:commentRss>http://paperjammed.com/2009/11/23/keeping-your-secrets-to-yourself-old-changes-lingering-in-your-pdf-files/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Dodged the corrupt-document bullet this time, just barely&#8230;</title>
		<link>http://paperjammed.com/2009/10/27/dodged-the-corrupt-document-bullet-this-time-just-barely/</link>
		<comments>http://paperjammed.com/2009/10/27/dodged-the-corrupt-document-bullet-this-time-just-barely/#comments</comments>
		<pubDate>Tue, 27 Oct 2009 21:52:30 +0000</pubDate>
		<dc:creator>Tad</dc:creator>
				<category><![CDATA[Searching and Indexing]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Data Loss]]></category>
		<category><![CDATA[Geeky]]></category>
		<category><![CDATA[Indexing]]></category>
		<category><![CDATA[Knowledge Management]]></category>
		<category><![CDATA[PDF]]></category>

		<guid isPermaLink="false">http://paperjammed.com/?p=750</guid>
		<description><![CDATA[A couple of weeks ago, a co-worker sent me a PDF document to look at. He said that he was having trouble copying and pasting from the document and was scratching his head about why this particular PDF would have such issues.
As it would turn out, there were several thousand other documents on a file [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignright size-medium wp-image-751" title="gibberish document in a file folder" src="http://paperjammed.com/wp-content/uploads/2009/10/iStock_000006486654XSmall-300x199.jpg" alt="gibberish document in a file folder" width="300" height="199" />A couple of weeks ago, a co-worker sent me a PDF document to look at. He said that he was having trouble copying and pasting from the document and was scratching his head about why this particular PDF would have such issues.</p>
<p>As it turned out, there were several thousand other documents on a file server that shared the same funny behavior. By the time we were done struggling with this problem, I had gained new respect for PDF corruption issues and their prevention.<span id="more-750"></span></p>
<p><strong>The Problem</strong></p>
<p>We were looking to load a few thousand of these scientific reports into a fancy-schmancy new database, with linguistic search and other bells and whistles. Much to our chagrin, these documents just weren&#8217;t loading, and we couldn&#8217;t understand why. They were text documents, with some embedded images, but mostly straightforward text.</p>
<p>Here is an excerpt:</p>
<p><img class="alignnone size-full wp-image-755" title="20091027-plaintext" src="http://paperjammed.com/wp-content/uploads/2009/10/20091027-plaintext.gif" alt="20091027-plaintext" width="521" height="93" /></p>
<p>And you can tell that it is right and proper text because when I blow it up all the way, the fonts are nice and smooth—this isn&#8217;t just an image of text.</p>
<p><img class="alignnone size-full wp-image-756" title="20091027-smooth-letter" src="http://paperjammed.com/wp-content/uploads/2009/10/20091027-smooth-letter.gif" alt="20091027-smooth-letter" width="258" height="295" /></p>
<p>But if I copy and paste that particular paragraph into any handy editor (Notepad, in this case), this is what I see:</p>
<p><img class="alignnone size-full wp-image-757" title="20091027-notepad" src="http://paperjammed.com/wp-content/uploads/2009/10/20091027-notepad.gif" alt="20091027-notepad" width="496" height="155" /></p>
<p>And as far as I know, at this point the actual text is beyond the reach of average folks like me. We tried; believe me, we tried.</p>
<p><strong>What went wrong?</strong></p>
<p>A quick Google of the subject led us to understand that many PDF generation tools embed subsets of fonts, with nonstandard mappings from the text to the font.</p>
<p>This fellow explains it nicely:</p>
<p>&#8220;The PDF file does not contain all the information to extract the text. The problem is that a character in a PDF file may not contain information what &#8220;real&#8221; character it relates to. Some PDF generators do a pretty bad job when they embed fonts into PDF files. They use a proprietary encoding mechanism (e.g. 1 is A, 2 is B, 3 is C, &#8230;) in both the embedded font and when they place glyphs on the page. Without a table that implements the reverse (e.g. character code 1 is &#8216;A&#8217;) you cannot extract text from such a file.</p>
<p>There is nothing you can do (besides to complain to whoever created the PDF file, and the author of the software that created this file).&#8221;<br />
— from <a href="http://www.experts-exchange.com/Web_Development/Document_Imaging/Adobe_Acrobat/Q_21426533.html">khkremer on experts-exchange.com</a></p>
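<p>A toy model makes the point concrete. The Python sketch below is not how any real PDF generator works. It simply numbers characters in order of first use, the way the quote describes, and shows what copy and paste gets with and without the reverse table (in a real PDF that table is the ToUnicode map). All of the function names are invented for the illustration.</p>

```python
def build_subset_encoding(text):
    """Number each distinct character in order of first use (1, 2, 3, ...)."""
    encoding = {}
    for ch in text:
        if ch not in encoding:
            encoding[ch] = len(encoding) + 1
    return encoding


def place_glyphs(text, encoding):
    """What ends up on the page: glyph codes, not characters."""
    return bytes(encoding[ch] for ch in text)


def extract_without_table(codes):
    """No reverse table: the codes are read as if they were ordinary characters."""
    return codes.decode("latin-1")


def extract_with_table(codes, encoding):
    """With the reverse table, every code maps back to its character."""
    reverse = {code: ch for ch, code in encoding.items()}
    return "".join(reverse[code] for code in codes)


text = "scientific report"
encoding = build_subset_encoding(text)
codes = place_glyphs(text, encoding)

print(repr(extract_without_table(codes)))  # unreadable control characters
print(extract_with_table(codes, encoding))  # the original text
```

<p>The page still looks perfect because the embedded font knows which shape to draw for each code. Only the path back to real characters is missing.</p>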
<p>As it turned out, many of the reports had been generated by printing to Adobe Distiller from Microsoft Word. It would seem that the default settings used for Distiller included the &#8220;totally hose my document content&#8221; switch.</p>
<p><strong>The Solution</strong></p>
<p>We fretted over this quite a bit. These are important scientific reports, and there is no way to easily ungarble them. We finally ended up contacting the <a href="http://finereader.abbyy.com/">ABBYY FineReader</a> folks and trying out their OCR toolkit for Linux. Not only did this product make fast work of running optical character recognition on the sample document, but once we had a script running, we blew through the 10,000 pages the trial license gave us in a day or two.</p>
<p><strong>Imperfect, at best</strong></p>
<p>I am happy that we were able to salvage the bulk of the electronic knowledge found within those thousands of files, but our work barely scratched the surface.</p>
<p>For example, most of these documents have rich bookmarking of sections and keywording, such as this (content tastefully blurred on purpose).</p>
<p><img class="alignnone size-full wp-image-760" title="20091027-doc-with-contents" src="http://paperjammed.com/wp-content/uploads/2009/10/20091027-doc-with-contents.gif" alt="20091027-doc-with-contents" width="500" height="348" /></p>
<p>In addition, scientific documents typically have loads of tables full of numbers. Though it is possible to mine this data with a good OCR tool (the FineReader API provides tools for just this purpose), the tables are far more difficult to extract correctly once the original text information is lost.</p>
<p><strong>Final thoughts</strong></p>
<p>I wrote a few weeks ago about document formats, <a href="http://paperjammed.com/2009/09/29/are-your-portable-document-format-files-all-that/">mentioning the PDF/A document standard</a>. This is worth investigating, regardless of what your document needs are.</p>
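<p>If our thousands of files had originally been generated as PDF/A, and in particular at its stricter conformance level (PDF/A-1a), which requires that text be mapped to Unicode, we would almost certainly have been able to copy and paste from them without problems: that level prohibits the font shenanigans that were perpetrated on our garbled reports.</p>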
<p>In the end, our OCR sledgehammer approach worked like a charm, and is probably sufficient for our needs. Text mining is a pretty slushy business, so no one will complain if there are a few typos on each page. If they find the doc in a search, they can print it and read it the old-fashioned way.</p>
]]></content:encoded>
			<wfw:commentRss>http://paperjammed.com/2009/10/27/dodged-the-corrupt-document-bullet-this-time-just-barely/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Why not try a personal Wiki for some of your more amorphous notes?</title>
		<link>http://paperjammed.com/2009/10/12/why-not-try-a-personal-wiki-for-some-of-your-more-amorphous-notes/</link>
		<comments>http://paperjammed.com/2009/10/12/why-not-try-a-personal-wiki-for-some-of-your-more-amorphous-notes/#comments</comments>
		<pubDate>Tue, 13 Oct 2009 03:59:04 +0000</pubDate>
		<dc:creator>Tad</dc:creator>
				<category><![CDATA[Paperless Life]]></category>
		<category><![CDATA[Searching and Indexing]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Geeky]]></category>
		<category><![CDATA[Knowledge Management]]></category>
		<category><![CDATA[Media]]></category>
		<category><![CDATA[Networking]]></category>
		<category><![CDATA[Tools of the Trade]]></category>
		<category><![CDATA[Windows]]></category>

		<guid isPermaLink="false">http://paperjammed.com/?p=706</guid>
		<description><![CDATA[In my evenings, I sometimes find myself performing the role of &#8220;Resident Geek&#8221; at my nephew&#8217;s school, tending to network issues, computer problems, and my favorite, &#8220;The Internet is down!&#8221;
Over the past couple of years I have considered several different approaches for keeping a grip on which computers had which service patch, which router is [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignright size-medium wp-image-736" src="http://paperjammed.com/wp-content/uploads/2009/10/iStock_000008986250XSmall-300x199.jpg" alt="" width="300" height="199" />In my evenings, I sometimes find myself performing the role of &#8220;Resident Geek&#8221; at my nephew&#8217;s school, tending to network issues, computer problems, and my favorite, &#8220;The Internet is down!&#8221;</p>
<p>Over the past couple of years I have considered several different approaches for keeping a grip on which computers had which service patch, which router is getting flaky, and which cable connects the library to the classroom at the end of the hall.</p>
<p>I have tried Excel spreadsheets, an Access database, even a spiral-bound notebook—none of them made the job any easier. A few weeks ago I thought about trying a <a href="http://en.wikipedia.org/wiki/Wiki">Wiki</a> and this has turned out to be a perfect fit!</p>
<p>If you are looking to keep a loose scrapbook of notes with lots of arbitrary categories and relationships between them, a wiki might do the trick. In this article I&#8217;ll cover two simple freeware wikis you can carry around on a thumb drive.<span id="more-706"></span></p>
<p><strong>What&#8217;s in a Wiki?</strong></p>
<p>All of us have used Wikipedia at one time or another, and though it may be regarded with disdain by high school teachers, when you consider how it works, Wikipedia is an amazing achievement. But what is the nature of a wiki?</p>
<p>One of the key features is that any page can be easily edited at any time (of course this can be limited by permissions). Another attribute is the ability to breathe life into a new page just by calling its name.</p>
<p>Between these two features, you get the essence of wiki-ness.</p>
<p>For example, if I have a page that discusses North American bears, I can type in a list of bears in a special format, often in jammed-together <a href="http://en.wikipedia.org/wiki/CamelCase">Wiki Words</a>, like this:</p>
<ul>
<li><span style="color: #3366ff;"><strong>GrizzlyBear</strong></span></li>
<li><span style="color: #3366ff;"><strong>BlackBear</strong></span></li>
<li><span style="color: #3366ff;"><strong>BrownBear</strong></span></li>
</ul>
<p>As soon as I save the page, those bear names become hyperlinks. Even though I haven&#8217;t written any pages about the individual bears, whenever it finally suits me, I can click on <span style="color: #3366ff;"><strong>BlackBear</strong></span> and accept the invitation to &#8220;Create a new page called <span style="color: #3366ff;"><strong>BlackBear</strong></span>.&#8221;</p>
<p>Better still, a friend who knows about black bears might click on <span style="color: #3366ff;"><strong>BlackBear </strong></span>and write a beautiful page about the animals.</p>
<p>That&#8217;s what wikis are all about.</p>
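<p>Under the hood, the idea is simple enough to sketch in Python. The snippet below is an illustration only and is not taken from any particular wiki engine. It assumes a Wiki Word is two or more capitalized chunks jammed together, and the function names are invented. It finds the Wiki Words in some text and reports whether each would open an existing page or offer to create a new one.</p>

```python
import re

# Assumption: a Wiki Word is two or more capitalized chunks run together,
# such as BlackBear. Real wikis each have their own rules.
WIKI_WORD = re.compile(r"\b(?:[A-Z][a-z]+){2,}\b")


def find_wiki_words(text):
    """Return every Wiki Word in the text, in order of appearance."""
    return WIKI_WORD.findall(text)


def link_actions(text, existing_pages):
    """Map each Wiki Word to 'open' if its page exists, otherwise 'create'."""
    return {
        word: "open" if word in existing_pages else "create"
        for word in find_wiki_words(text)
    }


page = "North American bears: GrizzlyBear, BlackBear and BrownBear."
print(link_actions(page, existing_pages={"BlackBear"}))
```

<p>Every Wiki Word becomes a link the moment the page is saved. The only difference is whether the link opens a page or invites you to write one.</p>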
<p><strong>Back to the School Computers</strong></p>
<p>In a matter of minutes I was able to make a page that described the building and listed the various rooms in the building. I was able to then click on each room and &#8220;auto-vivify&#8221; a page for the room.</p>
<p>From that point, it was easy to create custom pages for each computer in the building, with each page listing the machine&#8217;s stats. I also created pages for each network switch or router.</p>
<p>In a matter of two or three evenings I had the skeleton of a solid knowledge base populated—it&#8217;s a pretty fancy looking web site with dozens of pages that took little effort to put together.</p>
<p>Last night I noticed that one of the machines wasn&#8217;t connecting to the Internet, though it connects fine to internal servers. I popped open its page on the wiki and added a simple note at the bottom of the page:</p>
<p><tt>2009-10-11 - This machine isn't able to connect to the Internet. Not sure why. It connects fine to internal servers.</tt></p>
<p>A few weeks ago I replaced a fan in a network switch. That was an easy annotation on the wiki page for that device.</p>
<p><strong>Personal Wikis</strong></p>
<p>There are many uses for personal wikis, mostly centered around <a href="http://en.wikipedia.org/wiki/Personal_knowledge_management">personal knowledge management</a> and <a href="http://en.wikipedia.org/wiki/Personal_information_management">personal information management</a>. People use wikis as a replacement for time and task management tools, as a place for gathering thoughts, as a sort of amorphous database, and many other things.</p>
<p>There are many different personal wikis available—here&#8217;s a <a href="http://en.wikipedia.org/wiki/Personal_wiki#Free_software">short list of free ones</a>. One nice simple wiki to try is <a href="http://en.wikipedia.org/wiki/TiddlyWiki">TiddlyWiki</a>. If you are looking for something with a bit more substance, you can try a portable version of <a href="http://en.wikipedia.org/wiki/MediaWiki">MediaWiki</a>—the engine behind Wikipedia—that runs off your thumb drive.</p>
<p><strong>TiddlyWiki</strong></p>
<p>This afternoon I downloaded the flyweight portable wiki called TiddlyWiki. This is an amazingly tight little application—it comes in the form of a single fat web page that you copy to your thumb drive. As you make edits to your TiddlyWiki, the single html page is saved with your changes. Since it&#8217;s a single fancy file, backups are dead easy.</p>
<p>Here&#8217;s what it looks like when you first launch the &#8220;empty.html&#8221; file:</p>
<p><img class="alignnone size-medium wp-image-718" src="http://paperjammed.com/wp-content/uploads/2009/10/20091012-tiddly1-300x161.png" alt="" width="300" height="161" /></p>
<p>After a half hour of twiddling around, I had thrown together this basic set of &#8220;Tiddlers&#8221;:</p>
<p><img class="alignnone size-full wp-image-720" src="http://paperjammed.com/wp-content/uploads/2009/10/20091012-tiddly2.png" alt="" width="626" height="720" /></p>
<p>In this screen shot you can see that there are now links that bring up custom &#8220;Tiddlers&#8221; for each computer and for each room. I have opened one of the little pages for <span style="color: #3366ff;"><strong>Computer21</strong></span>.</p>
<p>They describe these pages as being comparable to note cards. All in all, it is tight and easy to use.</p>
<p>Want to give it a try? Download it from the <a href="http://www.tiddlywiki.com/">TiddlyWiki</a> site. You really need to play with it to get a feel for what it can do!</p>
<p><strong>MediaWiki</strong></p>
<p>If you are looking for something with a little more meat on it, you can run the Wikipedia engine on your USB drive.</p>
<p>The easiest way to set this up is to let <a href="http://www.chsoftware.net/en/useware/mowes/mowes.htm">MoWeS</a> do everything for you. <strong>MoWeS</strong> stands for <strong>Mo</strong>dular <strong>We</strong>bserver <strong>S</strong>ystem. It&#8217;s a free product that you can configure as a self-contained Apache web server with a variety of cool apps like MediaWiki, running off a thumb drive.</p>
<p>Here&#8217;s how to set up MediaWiki in five minutes:</p>
<ul>
<li>Go to the <a href="http://www.chsoftware.net/en/useware/mowes/download.htm">MoWeS Mixer</a></li>
<li>The first time around choose &#8220;I do not have a <strong>MoWeS Portable II</strong> Package and want to obtain a new package&#8221; when prompted and click <strong>Go</strong>.</li>
<li>On the software lists, check <strong>Apache2</strong>, <strong>MySQL5</strong>, <strong>PHP5</strong>, and <strong>MediaWiki</strong></li>
<li>Click <strong>Download Now</strong></li>
<li>At this point the site asks a question <em>in German</em> to filter out download robots; it is a simple math problem. Fill in the answer and click <strong>Submit Query</strong><br />
(&#8220;<em>Zum Schutz vor Downloadrobotern geben Sie bitte das Ergebnis dieser Aufgabe ein: 5 + 8 =  ?</em>&#8221;, that is, &#8220;To protect against download robots, please enter the result of this problem: 5 + 8 = ?&#8221;)</li>
<li>Unzip the downloaded zip file,  <strong>mowes_portable.zip</strong>, and copy the files to your USB drive</li>
<li>Open your thumb drive and double-click <strong>mowes.exe</strong></li>
<li>Select your language and accept the license</li>
<li>Click <strong>install</strong>, and confirm when prompted</li>
</ul>
<p>The installation process may take several minutes, but rest assured that it isn&#8217;t installing anything on your computer.</p>
<p><strong>Note: </strong>I received two or three firewall warnings for the Apache web server and the MySQL database. I had to click the &#8220;Unblock&#8221; button for all of them before my new MediaWiki-on-a-stick would work correctly.</p>
<p>After all of the dust settled, I had this little window on my screen:</p>
<p><img class="alignnone size-medium wp-image-725" title="20091012-MoWeS1" src="http://paperjammed.com/wp-content/uploads/2009/10/20091012-MoWeS1-300x209.png" alt="20091012-MoWeS1" width="300" height="209" /></p>
<p>In order to shut down and close out, just click the <strong>End</strong> button.</p>
<p>Once your MediaWiki USB key is running, you can go to this web page:</p>
<p><span style="color: #3366ff;">http://127.0.0.1/mediawiki/index.php/Main_Page</span></p>
<p><img class="alignnone size-full wp-image-726" src="http://paperjammed.com/wp-content/uploads/2009/10/20091012-MoWeS2.png" alt="" width="593" height="524" /></p>
<p>It looks just like Wikipedia, doesn&#8217;t it?</p>
<p>What a truly amazing thing: you can carry around your own Wikipedia-style server on a USB key, plug it into any random machine, and start it up.</p>
<p><strong>Different Wiki Features</strong></p>
<p>As you try out different wiki software, you will notice that there are plenty of differences in the features they support:</p>
<ul>
<li>Each wiki has a different kind of editor. Some are visual; others are simple text editors.</li>
<li>The markup syntax you use for pages is different from wiki to wiki.</li>
<li>Most wikis support features such as &#8220;category pages&#8221; that find all pages tagged with a category.</li>
<li>Some support adding images and other content; others don&#8217;t. I imagine that TiddlyWiki probably has some means of embedding images, but I couldn&#8217;t find it.</li>
<li>A quick glance at the MediaWiki screenshot above shows extended features such as the Discussion tab and the History tab.</li>
<li>Some use the filesystem for their pages; others use a database.</li>
</ul>
<p>Since I wanted a central wiki for the whole school, I chose a different product from the portable wikis I discussed here: I decided to run <a href="http://moinmo.in/">MoinMoin</a> on an <a href="http://www.ubuntu.com/">Ubuntu</a> installation on an aging Gateway desktop machine. Nevertheless, the basic idea is still the same.</p>
<p>Once that arrangement becomes a little more stable I&#8217;ll write up a howto document, like the <a href="http://paperjammed.com/2009/02/15/new-life-for-an-old-clunker/">Linux NAS</a> one from a few months back.</p>
<p><strong>Other Sources</strong></p>
<p>There are loads of different personal wiki options out there and many people have written how-to documents and tutorials. Here are a few:</p>
<ul>
<li><a href="http://lifehacker.com/354005/run-your-personal-wikipedia-from-a-usb-stick">Run Your Personal Wikipedia from a USB Stick</a> (Lifehacker.com)</li>
<li><a href="http://lifehacker.com/163707/geek-to-live--set-up-your-personal-wikipedia">Geek to Live: Set up your personal Wikipedia</a> (Lifehacker.com)</li>
<li><a href="http://www.pmwiki.org/wiki/Cookbook/WikiOnAStick">Wiki On A Stick</a> (PmWiki.org)</li>
<li><a href="http://cplus.about.com/od/thebusinessofsoftware/ss/woas.htm">Getting Started with Wiki on a Stick</a> (About.com)</li>
<li><a href="http://www.giffmex.org/twfortherestofus.html">TiddlyWiki for the rest of us</a> (giffmex)</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://paperjammed.com/2009/10/12/why-not-try-a-personal-wiki-for-some-of-your-more-amorphous-notes/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>When migrating to a new operating system, Look Before You Leap!</title>
		<link>http://paperjammed.com/2009/09/07/when-migrating-to-a-new-operating-system-look-before-you-leap/</link>
		<comments>http://paperjammed.com/2009/09/07/when-migrating-to-a-new-operating-system-look-before-you-leap/#comments</comments>
		<pubDate>Tue, 08 Sep 2009 02:56:21 +0000</pubDate>
		<dc:creator>Tad</dc:creator>
				<category><![CDATA[Backups]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Data Loss]]></category>
		<category><![CDATA[Geeky]]></category>
		<category><![CDATA[Macintosh]]></category>
		<category><![CDATA[Tips]]></category>

		<guid isPermaLink="false">http://paperjammed.com/?p=676</guid>
		<description><![CDATA[I can&#8217;t help it. As soon as I hear of a new version of anything, whether it&#8217;s an application or the entire operating system, I have to install it.
Now prudence would lead one to take careful steps and wait until all of the wrinkles are ironed out before starting. I was almost not prudent enough [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignright size-medium wp-image-685" src="http://paperjammed.com/wp-content/uploads/2009/09/iStock_000005873765XSmall-241x300.jpg" alt="" width="241" height="300" />I can&#8217;t help it. As soon as I hear of a new version of <em>anything</em>, whether it&#8217;s an application or the entire operating system, I have to install it.</p>
<p>Now prudence would lead one to take careful steps and wait until all of the wrinkles are ironed out before starting. I was almost not prudent enough this week.</p>
<p><strong>Mac OS X Snow Leopard</strong></p>
<p>So folks have been talking about the new <a href="http://en.wikipedia.org/wiki/Mac_OS_X_v10.6">Snow Leopard</a> operating system for Mac. Over the past year, Apple has been positioning this version as more of an &#8220;under the hood&#8221; upgrade that tightens things up rather than a glitzy overhaul of the user interface. No matter what they said it was, I figured that it was newer, and therefore better, than the current OS, Leopard, and I had to have it.</p>
<p>I ordered my copy last week on Amazon and sat down with a smile as I awaited its arrival. And then I thought about doing a few quick Googles to see how other people have been making out with Snow Leopard. I immediately happened upon a few upgrade guides <a href="http://www.cultofmac.com/how-to-upgrade-to-snow-leopard-the-right-way/15141">like this one</a>, providing sage advice about the upgrade process. They recommended the &#8220;slash and burn&#8221; method, starting from a clean hard drive, and I felt that was a good idea. Nothing better than a wipe and fresh install to make your machine zip along twice as fast. And therein lies a tale.<span id="more-676"></span></p>
<p><strong>The first sign of trouble</strong></p>
<p>As I was reading up on the Snow Leopard upgrade process, I happened upon lists of &#8220;unsupported software&#8221; and casually glanced at the lists, expecting esoteric tools only used by three über geeks in the audio recording industry or perhaps some exotic ray-tracing software. Much to my surprise, I saw two of my favorite applications, in a very very short list of troublesome apps: <a href="http://en.wikipedia.org/wiki/Parallels_Desktop_for_Mac">Parallels</a> and <a href="http://www.elgato.com/elgato/na/mainmenu/home/what-is-eyetv.en.html">EyeTV</a>.</p>
<p>I immediately checked the versions and breathed a sigh of relief when I saw that my EyeTV version was safe. But, Parallels was another story&#8230; They have no plans for patching Parallels 3 to work with Snow Leopard, and why should they, when they can sell us Parallels 4!</p>
<p>So, I ordered my fresh copy of Parallels 4, from Amazon with a twenty dollar rebate. When it arrived, I spent an evening upgrading Parallels, and I thought I was all set for Snow Leopard.</p>
<p><strong>Preventative Measures</strong></p>
<p>Following the advice of the upgrade websites, and prior experience, I used <a href="http://www.bombich.com/software/ccc.html">Carbon Copy Cloner</a> to make a full backup of my hard drive on a spare external drive. On a hunch, I turned on the drive that I use for <a href="http://en.wikipedia.org/wiki/Time_Machine_(Apple_software)">Time Machine</a> and had it do one final &#8220;Time Machine&#8221; sweep through the system before bidding <em>adieu</em> to Leopard.</p>
<p>I knew that I had all of my installation media for stuff like iLife and Photoshop Elements, and I had all of my license keys in electronic form. It would be a simple matter of mounting the backup drive, copying over my loads of documents, and peering into them to find keys.</p>
<p><strong>The first attempt</strong></p>
<p>I boldly inserted the Snow Leopard disk and booted from the DVD drive, selecting the &#8220;Slash and Burn&#8221; method of installation. I reformatted the hard drive and went off for dinner while Snow Leopard installed.</p>
<p><strong>Trouble</strong></p>
<p>When I got home that evening, I started the lengthy process of installing stuff. I suddenly realized that it was not as easy as I had hoped: it&#8217;s one thing to reinstall something like Microsoft Office, but there seemed to be more loose ends than I had considered:</p>
<ul>
<li>How would I migrate my Mail settings from the old image to the new?</li>
<li>What was the best way to migrate the Address Book contents?</li>
<li>iTunes is great, but it has tendrils in everything. Can I simply copy my old library to the new without messing up my iPhone, Address Book, or other linked stuff?</li>
<li>How about those nice password tools such as 1Password and SplashID that keep your passwords safe and sound? I had no clue how to get their contents from the backup. I wasn&#8217;t sure if it was even possible to do so—perhaps I was supposed to have exported the data beforehand.</li>
</ul>
<p>It was becoming clearer to me that I had not done my homework at all.</p>
<p><strong>More trouble</strong></p>
<p>My initial shock at the depth of the upgrade process led me to start making a list of applications and looking at what I needed for each one. I soon found out that Snow Leopard support is somewhat spotty in many applications. In particular, the FineReader for ScanSnap software that I depend on so much for my scanning work flow is <a href="http://www.documentsnap.com/abbyy-finereader-and-snow-leopard-file-not-created-with-scansnap/">not fully supported</a>. Fujitsu says that they will have an update soon and to keep checking their web site.</p>
<p>My password tool, 1Password, is <a href="http://www.switchersblog.com/2009/08/update-1password-on-snow-leopard.html">another problem child</a>. It works only on 32-bit Safari, and Snow Leopard now runs Safari in 64-bit mode. Of course, a new version is coming, and I will probably have to pay for it, but it is still in beta.</p>
<p>There was <a href="http://graphicssoft.about.com/b/2009/08/28/what-about-photoshop-elements-6-in-snow-leopard.htm">quite a bit of chatter</a> on the Web about whether Adobe Photoshop Elements would work on Snow Leopard, and the responses seem split fifty-fifty for now.</p>
<p>Three very important tools were in danger of running in limited mode or not running at all, so I had to throw in the towel.</p>
<p><strong>Time Machine saves the day!</strong></p>
<p>As I sat, humbled, before my vanilla install of Snow Leopard, I admitted defeat. I slipped the Snow Leopard DVD back in the drive and rebooted from the DVD. This time, I selected the &#8220;Restore from Time Machine&#8221; option and turned on my Time Machine drive.</p>
<p>Guess what? It worked perfectly! Unlike many software products, Time Machine does exactly what it promises.</p>
<p>Within a few hours, my machine was fully restored to the way it looked seconds before I made my first attempt at Snow Leopard.</p>
<p><strong>A Final Word</strong></p>
<p>Learn from my mistakes, and from my rescue by a full backup. As much as you can&#8217;t wait to upgrade, please do the following:</p>
<ul>
<li>Inventory all of your applications that you really need.</li>
<li>Obtain the installation media (download or CD) for every single one.</li>
<li>Obtain the keys for every single one.</li>
<li>Investigate whether you need to export data from any of them, and make a checklist for these exports prior to upgrade.</li>
<li>Check the &#8220;Unsupported Software&#8221; lists that are out there for any red flags.</li>
<li>Check the web sites of your most important apps for their official word.</li>
<li>And finally, do a complete backup!</li>
</ul>
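<p>The first item on that list is the easiest to automate. Here is a minimal Python sketch that lists the application bundles in a folder. It assumes your apps live in <tt>/Applications</tt> as <tt>.app</tt> bundles, and the function name is invented for the example. It will not find command-line tools, plug-ins, or preference panes, so treat the result as a starting point for the checklist, not the whole inventory.</p>

```python
from pathlib import Path


def inventory_apps(app_dir):
    """Return the sorted names of the .app bundles directly inside app_dir."""
    root = Path(app_dir)
    return sorted(bundle.stem for bundle in root.glob("*.app"))


# Hypothetical usage on a Mac:
# for name in inventory_apps("/Applications"):
#     print(name)
```

<p>Print the list, then go down it line by line and note the installer, license key, and export steps for each app you actually need.</p>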
<p>It&#8217;s amazing how many applications and weird little utilities we forget we have. How could I have possibly remembered that I compiled a custom copy of the &#8220;rsync&#8221; executable for my backup workflow? I would have lost that and had to figure out how to rebuild it on Snow Leopard.</p>
<p>And I haven&#8217;t even talked about making sure your documents make it safely onto the new machine. That&#8217;s a whole &#8216;nother story.</p>
<p>In case I forgot to say it, please make a full backup.</p>
<p><strong>[Update: I'm giving Snow Leopard a rest for a few months]</strong></p>
<p>It has been said that Time Machine allows you to do a full restore from bare metal, and I&#8217;m living proof: I have done exactly that twice in the past week, with astounding success.</p>
<p>Encouraged by an episode of the <a href="http://www.macobserver.com/tmo/features/mac_geek_gab/">Mac Geek Gab</a> where they talked about their experiences upgrading their existing systems to Snow Leopard, I decided I would give the upgrade-in-place option a try. I expected some things to not work well and others to be quirky, but here&#8217;s what happened&#8230;</p>
<p>The actual install was painless, taking an hour or so to complete. I then began to kick the tires to see what was broken.</p>
<p>It was clear where those 64 bits went: apps like Safari were positively zippy, and I was pleasantly surprised with each new application I launched. All of my special settings seemed to make it through alive, including my password manager, though I did have to re-enter some of my registration keys. All of my mail and contacts made it through well. I was able to sync my iPhone without incident.</p>
<p>A few apps weren&#8217;t working correctly, so I looked for 10.6-compatible updates and found newer versions of <a href="http://www.ironicsoftware.com/yep/">Yep</a> and <a href="http://alum.hampshire.edu/~bjk02/xGestures/">xGestures</a>.</p>
<p>I did note that there is currently no ad blocker available for Safari that runs in 64-bit mode. This is disappointing because, even though I understand that Apple wants us to see <em>their</em> ads, I can&#8217;t imagine that they really want us to suffer from the flickering, jumping dreck that should have ended with the hated &#8220;punch the monkey&#8221; banners of years gone by. The fact of the matter is, if I want that 64-bit speed and snap, I guess I have to watch ads.</p>
<p><strong>The Showstopper</strong></p>
<p>I decided to scan a document to see just how difficult it would be to get my workflow going again. Michael F, below, wrote the truth about the situation: the scanner works fine in certain modes, but the OCR software doesn&#8217;t.</p>
<p>He pointed out that FineReader looks for a specific bit of metadata in the PDF that identifies it as a ScanSnap PDF. Sadly, that metadata string changed under Snow Leopard.</p>
<blockquote><p>The Finereader software is looking for “Mac OS X 10.5.8 Quartz PDFContext”, but under Snow Leopard, the string is set to “Mac OS X 10.6 Quartz PDFContext” instead.</p></blockquote>
<p>There are ways to tweak PDF metadata, and one of them is by using <a href="http://www.accesspdf.com/pdftk/">pdftk</a>.</p>
<p>I went to the pdftk site, all ready to download it and start OCRing my PDFs. I was greeted with less-than-optimal news: the Mac version on offer was compiled for Panther (OS X 10.3), a release from several years ago.</p>
<p>I knew it wouldn&#8217;t work, but I gave it a try anyway: the app told me it needed Rosetta to run. I could have installed Rosetta at that point, but I figured I wanted a <em>proper</em> compiled version.</p>
<p>From there, I looked into compiling the app on OS X 10.6. I should have remembered my struggles with this several months ago on a Solaris Unix box, where I found that pdftk depends on a monster called GCJ that needed about forty other software packages to compile. It seemed a gargantuan task that I wasn&#8217;t ready to begin.</p>
<p>On a hunch, I inspected the contents of a <em>new</em> PDF and an <em>old</em> PDF, the latter still acceptable to FineReader. Though much of each file was raw binary, the metadata was in plain text at the end. A short <a href="http://en.wikipedia.org/wiki/Sed">sed</a> script was all it took to replace the offending 10.6 string with the 10.5.8 one that FineReader accepts.</p>
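<p>A substitution along those lines might look like the following sketch. It is a reconstruction under assumptions rather than the original script: the function name <code>patch_pdf_string</code> and the file names are placeholders, and the two strings are the ones quoted above.</p>

```shell
#!/bin/sh
# Sketch only: copy a PDF, swapping the Snow Leopard metadata string for
# the Leopard one. Names here are illustrative, not from the original.

OLD_RE='Mac OS X 10\.6 Quartz PDFContext'   # string written under 10.6 (dot escaped)
NEW='Mac OS X 10.5.8 Quartz PDFContext'     # string FineReader is said to look for

# patch_pdf_string IN OUT: write a copy of IN to OUT with the string replaced.
# LC_ALL=C makes sed treat the input as plain bytes instead of text in the
# current locale, which matters because most of a PDF is binary data.
patch_pdf_string() {
    LC_ALL=C sed "s/$OLD_RE/$NEW/g" "$1" > "$2"
}
```

<p>Calling <code>patch_pdf_string scan.pdf patched.pdf</code> writes the altered copy and leaves the original alone. One caveat: the two strings differ in length, so the replacement shifts every byte after it, and a PDF&#8217;s cross-reference table records byte offsets. A file edited this way is no longer internally consistent, which a strict reader may notice.</p>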
<p>In spite of my best efforts, FineReader still rejected my hand-tooled PDF file. It knew that it was a bogus file.</p>
<p>I have looked into Abbyy FineReader several times before, as well as Fujitsu&#8217;s ScanSnap support, and was unimpressed. For two vendors that produce products that are at the top of their class—FineReader is arguably the best OCR you can get for Mac, and ScanSnap is the best document scanner for the common man—they sure do have miserable customer support.</p>
<p>It is as if neither company cares a whit about the Macintosh platform or their customers. While most other vendors are busily patching their products and giving hourly updates on their Snow Leopard compatibility progress, Abbyy and Fujitsu just don&#8217;t seem to care that their best-of-breed combo suddenly doesn&#8217;t work on Mac.</p>
<p>Once they get this sorted out (hopefully in the next few months) I&#8217;ll give Snow Leopard another try. In the meantime, I&#8217;m sticking with good old Leopard.</p>
]]></content:encoded>
			<wfw:commentRss>http://paperjammed.com/2009/09/07/when-migrating-to-a-new-operating-system-look-before-you-leap/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Automate ScanSnap OCR process on your Mac with AppleScript</title>
		<link>http://paperjammed.com/2009/08/29/automate-scansnap-ocr-process-on-your-mac-with-applescript/</link>
		<comments>http://paperjammed.com/2009/08/29/automate-scansnap-ocr-process-on-your-mac-with-applescript/#comments</comments>
		<pubDate>Sat, 29 Aug 2009 23:50:08 +0000</pubDate>
		<dc:creator>Tad</dc:creator>
				<category><![CDATA[Software]]></category>
		<category><![CDATA[Workflow]]></category>
		<category><![CDATA[Geeky]]></category>
		<category><![CDATA[PDF]]></category>
		<category><![CDATA[Scanning]]></category>
		<category><![CDATA[Scripting]]></category>
		<category><![CDATA[Searching and Indexing]]></category>

		<guid isPermaLink="false">http://paperjammed.com/?p=648</guid>
		<description><![CDATA[Some months back I wrote an article on using scripting languages to glue workflows together. My inspiration for that article was a bit of AppleScript that I had suffered over in order to smooth over a minor annoyance of my scan-to-OCR workflow.
I had promised that once I cleaned up the embarrassing bits of code I [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignright size-full wp-image-658" src="http://paperjammed.com/wp-content/uploads/2009/08/20090829-applescript.gif" alt="" width="128" height="128" />Some months back I wrote an article on using scripting languages to glue workflows together. My inspiration for that article was a bit of AppleScript that I had suffered over in order to smooth over a minor annoyance of my scan-to-OCR workflow.</p>
<p>I had promised that once I cleaned up the embarrassing bits of code I would post a perfect polished version here, but such promises are rarely fulfilled. A reader posted a comment asking for that source code, so I will post it here in its current state. The truth is, I have been using this script for months and, though it has some quirks, it works fine.</p>
<p>So this post is about Macintosh, AppleScript, and the ScanSnap-to-FineReader workflow. If these don&#8217;t interest you, better move on.</p>
<p><b>Update:</b> The script on this page works only with Leopard (10.5). Get the Snow Leopard version <a href="http://paperjammed.com/2010/01/04/automate-scansnap-ocr-process-on-your-mac-with-applescript-snow-leopard-edition/">here</a>.<br />
<span id="more-648"></span></p>
<p><strong>The Original Problem</strong></p>
<p>The Fujitsu ScanSnap S510m, my workhorse scanner, was designed to scan documents quickly and generate PDF files, and this it does flawlessly. To provide OCR support, Fujitsu ships a special version of <a href="http://finereader.abbyy.com/">FineReader</a>, called <strong>FineReader for ScanSnap</strong>. The standard OCR configuration is to chain the output of the scanner to the FineReader program.</p>
<p>The problem is that this forces scanning and OCR to run in lockstep: you scan a document, you wait for OCR, and then you scan another document.</p>
<p>I wanted to write a simple AppleScript that would detach the &#8220;Scan a Document&#8221; process from the &#8220;OCR&#8221; process. With this script, I can scan documents at whatever rate pleases me, and the OCR engine will chug along at its own pace, consuming my scanned documents and producing OCR&#8217;d documents.</p>
<p><strong>My Approach</strong></p>
<p>I looked hard at the OCR application, trying to find AppleScript hooks or special command line switches that might allow me to control it better. Sadly, it was not designed to be scriptable. The only thing I could do was call the FineReader application with a source file.</p>
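<p>From a shell, that single hook might look something like the sketch below. It assumes the macOS <code>open</code> command and the application name used in the script further down; <code>OPEN_CMD</code> is a hypothetical override added only so the function can be dry-run on a machine without FineReader.</p>

```shell
#!/bin/sh
# Sketch only: the one available hook, expressed as a shell command.
# OPEN_CMD is a hypothetical override and not part of the original script.

# ocr_file FILE: ask FineReader for ScanSnap to open FILE.
# On macOS, open(1) with -a hands the file to the named application.
ocr_file() {
    ${OPEN_CMD:-open} -a 'FineReader for ScanSnap' "$1"
}
```

<p>Running <code>OPEN_CMD=echo</code> first turns the call into a dry run that only prints the arguments. Note that nothing here reports when OCR has finished, which is why the script has to watch for that separately.</p>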
<p>Given this limitation, I considered writing a script that would look at a particular folder, identifying new files as they appear and passing them on to FineReader.</p>
<p>Fortunately, AppleScript provides this kind of functionality with little effort in the form of <strong>Folder Actions</strong>. Perhaps the best way to see these in action (and try them out) is to read this post on <a href="http://www.tuaw.com/2009/02/16/applescript-exploring-the-power-of-folder-actions-part-i/">Exploring the power of Folder Actions</a>.</p>
<p>In order to achieve my goals, I did the following:</p>
<ul>
<li>Created a folder called &#8220;Pending Documents&#8221;</li>
<li>Wrote a script to find the oldest unprocessed file and call FineReader with it</li>
<li>Attached the script to the folder as a Folder Action</li>
</ul>
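<p>The second step, picking the next file that has not been through OCR yet, can be sketched in shell as follows. This is a simplified illustration, not the script itself: the function name <code>next_unprocessed</code> is made up, it recognizes OCR output by its file name suffix instead of checking the file&#8217;s creator metadata, and it takes files in the name order the shell glob returns them without sorting by age.</p>

```shell
#!/bin/sh
# Sketch only: print the first PDF in a folder that has no OCR'd sibling.
# The suffix below is the output naming that the AppleScript checks for.

SUFFIX=' processed by FineReader.pdf'

# next_unprocessed DIR: print the path of the next file to OCR, or print
# nothing if every PDF already has its processed counterpart.
next_unprocessed() {
    for f in "$1"/*.pdf; do
        [ -e "$f" ] || continue          # the glob matched nothing
        case "$f" in
            *"$SUFFIX") continue ;;      # this file is OCR output itself
        esac
        base=${f%.pdf}                   # strip the extension
        if [ ! -e "$base$SUFFIX" ]; then
            printf '%s\n' "$f"           # first file with no OCR'd sibling
            return 0
        fi
    done
    return 0
}
```

<p>Calling it repeatedly, running OCR on each result until it prints nothing, gives the queue behavior described above.</p>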
<p><strong>The Script</strong></p>
<p>Let&#8217;s jump right into the AppleScript. <a href="http://paperjammed.com/wp-content/uploads/2009/08/Run-OCR-on-New-Folder-Items.scpt">Download the script here.</a></p>
<div class="codecolorer-container applescript default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:435px;height:300px;"><table cellspacing="0" cellpadding="0"><tbody><tr><td><div class="applescript codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #808080; font-style: italic;">(*<br />
This is a folder listener script that will act as a queue, receiving<br />
PDF files from the ScanSnap scanner and feeding them, one by one, to<br />
the Abbyy FineReader OCR software.<br />
<br />
This allows you to keep scanning while the OCR job runs in the background<br />
on all of the unprocessed files.<br />
<br />
Why do we want to do this?<br />
<br />
The ScanSnap Manager software does not support this by default, so<br />
when you scan in a file, it sends it to FineReader for OCR. You then<br />
must wait until FineReader finishes its work before scanning in another<br />
document.<br />
<br />
This script allows you to keep scanning without waiting for OCR.<br />
<br />
Installation:<br />
<br />
o &nbsp; Copy this script to:<br />
<br />
&nbsp; &nbsp; &lt;home&gt;/Library/Scripts/Folder Action Scripts<br />
<br />
&nbsp; &nbsp; You may have to create the &quot;Folder Action Scripts&quot; folder.<br />
<br />
o &nbsp; Now open a Finder window, control-click and choose:<br />
<br />
&nbsp; &nbsp; More / Configure Folder Actions...<br />
<br />
o &nbsp; Check the &quot;Enable Folder Actions&quot; checkbox, if not checked<br />
o &nbsp; Click the &quot;+&quot; in the bottom left<br />
o &nbsp; Select a folder and click Open<br />
o &nbsp; Choose the script &quot;Run OCR on New Folder Items&quot; and click Attach<br />
<br />
Copyright (C) 2009 Tad Harrison<br />
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> <span style="color: #0066ff;">adding</span> <span style="color: #0066ff;">folder</span> <span style="color: #0066ff;">items</span> <span style="color: #ff0033; font-weight: bold;">to</span> this_folder <span style="color: #ff0033;">after</span> <span style="color: #0066ff;">receiving</span> added_items<br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">try</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">-- Just in case FineReader is running, wait until it is ready</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; waitForFineReaderFinish<span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> moreWorkToDo <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">true</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">repeat</span> <span style="color: #ff0033; font-weight: bold;">while</span> moreWorkToDo<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> aFile <span style="color: #ff0033; font-weight: bold;">to</span> getNextFile<span style="color: #000000;">&#40;</span>this_folder<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">if</span> <span style="color: #ff0033;">not</span> aFile <span style="color: #000000;">=</span> <span style="color: #009900;">&quot;&quot;</span> <span style="color: #ff0033; font-weight: bold;">then</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; log <span style="color: #0066ff;">POSIX path</span> <span style="color: #ff0033; font-weight: bold;">of</span> aFile<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; ocrFile<span style="color: #000000;">&#40;</span>aFile<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">else</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> moreWorkToDo <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">false</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">if</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">repeat</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; exitApp<span style="color: #000000;">&#40;</span><span style="color: #009900;">&quot;FineReader for ScanSnap&quot;</span><span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">on</span> <span style="color: #ff0033; font-weight: bold;">error</span> errorStr <span style="color: #0066ff;">number</span> errNum<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #0066ff;">display dialog</span> <span style="color: #009900;">&quot;Error &quot;</span> <span style="color: #000000;">&amp;</span> errNum <span style="color: #000000;">&amp;</span> <span style="color: #009900;">&quot; while running OCR: &quot;</span> <span style="color: #000000;">&amp;</span> errorStr<br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">try</span><br />
<span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #0066ff;">adding</span> <span style="color: #0066ff;">folder</span> <span style="color: #0066ff;">items</span> <span style="color: #ff0033; font-weight: bold;">to</span><br />
<span style="color: #808080; font-style: italic;">(*<br />
Name: ocrFile<br />
Description: Runs OCR on the given file and waits for FineReader to finish<br />
Parameters:<br />
&nbsp; aFile - the file to be OCR'd<br />
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> ocrFile<span style="color: #000000;">&#40;</span>aFile<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">tell</span> <span style="color: #0066ff;">application</span> <span style="color: #009900;">&quot;FineReader for ScanSnap&quot;</span> <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">open</span> aFile<br />
&nbsp; &nbsp; <span style="color: #808080; font-style: italic;">-- Make sure FineReader actually starts before we start waiting for it to stop</span><br />
&nbsp; &nbsp; waitForFineReaderStart<span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #808080; font-style: italic;">-- Now wait 'till it's done so we do one file at a time</span><br />
&nbsp; &nbsp; waitForFineReaderFinish<span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span><br />
<span style="color: #ff0033; font-weight: bold;">end</span> ocrFile<br />
<span style="color: #808080; font-style: italic;">(*<br />
Name: appIsRunning<br />
Description: Determines if a particular application is running.<br />
Parameters:<br />
&nbsp; &nbsp; appName - the name of the application to be tested<br />
Returns: True if the application is running; otherwise False<br />
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> appIsRunning<span style="color: #000000;">&#40;</span>appName<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">tell</span> <span style="color: #0066ff;">application</span> <span style="color: #009900;">&quot;System Events&quot;</span> <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #000000;">&#40;</span><span style="color: #0066ff;">name</span> <span style="color: #ff0033; font-weight: bold;">of</span> processes<span style="color: #000000;">&#41;</span> <span style="color: #ff0033;">contains</span> appName<br />
<span style="color: #ff0033; font-weight: bold;">end</span> appIsRunning<br />
<span style="color: #808080; font-style: italic;">(*<br />
Name: exitApp<br />
Description: Exits the specified app if it is running.<br />
Parameters:<br />
&nbsp; &nbsp; appName - the application name<br />
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> exitApp<span style="color: #000000;">&#40;</span>appName<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">if</span> appIsRunning<span style="color: #000000;">&#40;</span>appName<span style="color: #000000;">&#41;</span> <span style="color: #ff0033; font-weight: bold;">then</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">tell</span> <span style="color: #0066ff;">application</span> appName <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">quit</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">if</span><br />
<span style="color: #ff0033; font-weight: bold;">end</span> exitApp<br />
<span style="color: #808080; font-style: italic;">(*<br />
Name: getNextFile<br />
Description: Finds the next unprocessed ScanSnap PDF<br />
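Parameters:<br />
&nbsp; &nbsp; aFolder - the folder to search<br />
Returns: the next unprocessed file, or &quot;&quot; if none remain<br />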
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> getNextFile<span style="color: #000000;">&#40;</span>aFolder<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> masterFileList <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">list</span> <span style="color: #0066ff;">folder</span> aFolder ¬<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">without</span> <span style="color: #0066ff;">invisibles</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> posixPath <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">POSIX path</span> <span style="color: #ff0033; font-weight: bold;">of</span> aFolder<br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">repeat</span> <span style="color: #ff0033; font-weight: bold;">with</span> i <span style="color: #ff0033; font-weight: bold;">from</span> <span style="color: #000000;">1</span> <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">count</span> masterFileList<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> fileName <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">item</span> i <span style="color: #ff0033; font-weight: bold;">of</span> masterFileList<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> posixFilePath <span style="color: #ff0033; font-weight: bold;">to</span> posixPath <span style="color: #000000;">&amp;</span> fileName<br />
&nbsp; &nbsp; &nbsp; &nbsp; log posixFilePath<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">--</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">-- Construct a FineReader file name from our file</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">--</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> posixBaseName <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">do shell script</span> ¬<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;">&quot;filename=&quot;</span> <span style="color: #000000;">&amp;</span> <span style="color: #0066ff;">quoted form</span> <span style="color: #ff0033; font-weight: bold;">of</span> posixFilePath <span style="color: #000000;">&amp;</span> <span style="color: #009900;">&quot;; echo ${filename%<span style="color: #000000; font-weight: bold;">\\</span>.*}&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; log <span style="color: #000000;">&#40;</span><span style="color: #009900;">&quot;Name: &quot;</span> <span style="color: #000000;">&amp;</span> posixBaseName<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> posixOcrFilePath <span style="color: #ff0033; font-weight: bold;">to</span> posixBaseName <span style="color: #000000;">&amp;</span> <span style="color: #009900;">&quot; processed by FineReader.pdf&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">--</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">-- See if the FineReader file we constructed exists</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">--</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">tell</span> <span style="color: #0066ff;">application</span> <span style="color: #009900;">&quot;System Events&quot;</span> <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #ff0033; font-weight: bold;">set</span> ocrFileExists <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">exists</span> <span style="color: #0066ff;">file</span> posixOcrFilePath<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">if</span> ocrFileExists <span style="color: #ff0033; font-weight: bold;">then</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; log <span style="color: #000000;">&#40;</span><span style="color: #009900;">&quot;OCR file found for &quot;</span> <span style="color: #000000;">&amp;</span> posixBaseName<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">if</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">tell</span> <span style="color: #0066ff;">me</span> <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #ff0033; font-weight: bold;">set</span> fileCreator <span style="color: #ff0033; font-weight: bold;">to</span> getSpotlightInfo for <span style="color: #009900;">&quot;kMDItemCreator&quot;</span> <span style="color: #ff0033; font-weight: bold;">from</span> posixFilePath<br />
&nbsp; &nbsp; &nbsp; &nbsp; log <span style="color: #000000;">&#40;</span><span style="color: #009900;">&quot;Creator: &quot;</span> <span style="color: #000000;">&amp;</span> fileCreator<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">if</span> <span style="color: #ff0033;">not</span> ocrFileExists <span style="color: #ff0033;">and</span> fileCreator <span style="color: #000000;">=</span> <span style="color: #009900;">&quot;ScanSnap Manager&quot;</span> <span style="color: #ff0033; font-weight: bold;">then</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">return</span> <span style="color: #0066ff;">POSIX file</span> posixFilePath<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">if</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">repeat</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">return</span> <span style="color: #009900;">&quot;&quot;</span><br />
<span style="color: #ff0033; font-weight: bold;">end</span> getNextFile<br />
<span style="color: #808080; font-style: italic;">(*<br />
Name: getSpotlightInfo<br />
Description: Gets a named attribute from metadata for a specific file.<br />
Parameters:<br />
&nbsp; &nbsp; for myattribute - the name of the attribute<br />
&nbsp; &nbsp; from myfile - the name of the file<br />
Returns: the attribute value or &quot;&quot; if none found<br />
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> getSpotlightInfo for myattribute <span style="color: #ff0033; font-weight: bold;">from</span> myfile<br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">try</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> this_kMDItemResult <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #009900;">&quot;&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">tell</span> <span style="color: #0066ff;">application</span> <span style="color: #009900;">&quot;Finder&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> this_item <span style="color: #ff0033; font-weight: bold;">to</span> myfile <span style="color: #ff0033;">as</span> <span style="color: #0066ff;">string</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> this_item <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">POSIX path</span> <span style="color: #ff0033; font-weight: bold;">of</span> this_item<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> this_kMDItem <span style="color: #ff0033; font-weight: bold;">to</span> myattribute<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> theResult <span style="color: #ff0033; font-weight: bold;">to</span> words <span style="color: #ff0033; font-weight: bold;">of</span> <span style="color: #000000;">&#40;</span><span style="color: #0066ff;">do shell script</span> <span style="color: #009900;">&quot;/usr/bin/mdls -name &quot;</span> <span style="color: #000000;">&amp;</span> this_kMDItem <span style="color: #000000;">&amp;</span> <span style="color: #009900;">&quot; -raw -nullMarker None &quot;</span> <span style="color: #000000;">&amp;</span> <span style="color: #0066ff;">quoted form</span> <span style="color: #ff0033; font-weight: bold;">of</span> this_item<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; log <span style="color: #009900;">&quot;Result: &quot;</span> <span style="color: #000000;">&amp;</span> theResult <span style="color: #ff0033;">as</span> <span style="color: #0066ff;">string</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">repeat</span> <span style="color: #ff0033; font-weight: bold;">with</span> j <span style="color: #ff0033; font-weight: bold;">from</span> <span style="color: #000000;">1</span> <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">number</span> <span style="color: #ff0033; font-weight: bold;">of</span> <span style="color: #0066ff;">items</span> <span style="color: #ff0033; font-weight: bold;">in</span> theResult<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> this_kMDItemResult <span style="color: #ff0033; font-weight: bold;">to</span> this_kMDItemResult <span style="color: #000000;">&amp;</span> <span style="color: #0066ff;">item</span> j <span style="color: #ff0033; font-weight: bold;">of</span> theResult <span style="color: #ff0033;">as</span> <span style="color: #0066ff;">string</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">if</span> j <span style="color: #000000;">&lt;</span> <span style="color: #0066ff;">number</span> <span style="color: #ff0033; font-weight: bold;">of</span> <span style="color: #0066ff;">items</span> <span style="color: #ff0033; font-weight: bold;">in</span> theResult <span style="color: #ff0033; font-weight: bold;">then</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> this_kMDItemResult <span style="color: #ff0033; font-weight: bold;">to</span> this_kMDItemResult <span style="color: #000000;">&amp;</span> <span style="color: #009900;">&quot; &quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">if</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">repeat</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">tell</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">on</span> <span style="color: #ff0033; font-weight: bold;">error</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> this_kMDItemResult <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #009900;">&quot;&quot;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">try</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">return</span> this_kMDItemResult<br />
<span style="color: #ff0033; font-weight: bold;">end</span> getSpotlightInfo<br />
<span style="color: #808080; font-style: italic;">(*<br />
Name: waitForFineReaderFinish<br />
Description: Waits until FineReader OCR is complete.<br />
Returns: True if FineReader OCR is complete; otherwise False<br />
<br />
This procedure constantly loops through open FineReader windows looking<br />
for the window called &quot;Converting the Document&quot;<br />
Once that window goes away, the procedure exits.<br />
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> waitForFineReaderFinish<span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">if</span> <span style="color: #ff0033;">not</span> appIsRunning<span style="color: #000000;">&#40;</span><span style="color: #009900;">&quot;FineReader for ScanSnap&quot;</span><span style="color: #000000;">&#41;</span> <span style="color: #ff0033; font-weight: bold;">then</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">return</span> <span style="color: #0066ff;">false</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">if</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">tell</span> <span style="color: #0066ff;">application</span> <span style="color: #009900;">&quot;System Events&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> window_found <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">true</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">repeat</span> <span style="color: #ff0033; font-weight: bold;">until</span> <span style="color: #ff0033;">not</span> window_found<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> ew <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">name</span> <span style="color: #ff0033; font-weight: bold;">of</span> <span style="color: #ff0033;">every</span> <span style="color: #0066ff;">window</span> <span style="color: #ff0033; font-weight: bold;">of</span> <span style="color: #0066ff;">application</span> process <span style="color: #009900;">&quot;FineReader for ScanSnap&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">if</span> ew <span style="color: #ff0033;">contains</span> <span style="color: #009900;">&quot;Converting the Document&quot;</span> <span style="color: #ff0033; font-weight: bold;">then</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> window_found <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">true</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; delay <span style="color: #000000;">1</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">else</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> window_found <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">false</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">if</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">repeat</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">tell</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">return</span> <span style="color: #0066ff;">true</span><br />
<span style="color: #ff0033; font-weight: bold;">end</span> waitForFineReaderFinish<br />
<span style="color: #808080; font-style: italic;">(*<br />
Name: waitForFineReaderStart<br />
Description: Waits until FineReader OCR has begun.<br />
Returns: True if FineReader OCR has started; otherwise False<br />
<br />
This procedure is used to give FineReader a moment to actually start<br />
chewing on a file. It simply waits for the &quot;Converting the Document&quot;<br />
window to appear.<br />
In order to avoid a permanent loop if FineReader doesn't<br />
start, this times out after 30 seconds.<br />
*)</span><br />
<span style="color: #ff0033; font-weight: bold;">on</span> waitForFineReaderStart<span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">if</span> <span style="color: #ff0033;">not</span> appIsRunning<span style="color: #000000;">&#40;</span><span style="color: #009900;">&quot;FineReader for ScanSnap&quot;</span><span style="color: #000000;">&#41;</span> <span style="color: #ff0033; font-weight: bold;">then</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">return</span> <span style="color: #0066ff;">false</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">if</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">with</span> <span style="color: #ff0033; font-weight: bold;">timeout</span> <span style="color: #ff0033; font-weight: bold;">of</span> <span style="color: #000000;">30</span> seconds<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">tell</span> <span style="color: #0066ff;">application</span> <span style="color: #009900;">&quot;System Events&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> window_found <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">false</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">repeat</span> <span style="color: #ff0033; font-weight: bold;">until</span> window_found<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> ew <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">name</span> <span style="color: #ff0033; font-weight: bold;">of</span> <span style="color: #ff0033;">every</span> <span style="color: #0066ff;">window</span> <span style="color: #ff0033; font-weight: bold;">of</span> <span style="color: #0066ff;">application</span> process <span style="color: #009900;">&quot;FineReader for ScanSnap&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">if</span> ew <span style="color: #ff0033;">contains</span> <span style="color: #009900;">&quot;Converting the Document&quot;</span> <span style="color: #ff0033; font-weight: bold;">then</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> window_found <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">true</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">else</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">set</span> window_found <span style="color: #ff0033; font-weight: bold;">to</span> <span style="color: #0066ff;">false</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; delay <span style="color: #000000;">1</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">if</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">repeat</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">tell</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">end</span> <span style="color: #ff0033; font-weight: bold;">timeout</span><br />
&nbsp; &nbsp; <span style="color: #ff0033; font-weight: bold;">return</span> <span style="color: #0066ff;">true</span><br />
<span style="color: #ff0033; font-weight: bold;">end</span> waitForFineReaderStart</div></td></tr></tbody></table></div>
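<p>The polling idea behind <strong>waitForFineReaderStart</strong> can be sketched outside AppleScript as well. The following is a minimal Python sketch, not part of the original script: <code>wait_for</code> and its parameters are hypothetical names. AppleScript&#8217;s <code>with timeout</code> governs how long the script waits for a single application command to reply, so an explicit deadline around the loop is one way to make the 30-second limit apply to the whole wait.</p>

```python
import time


def wait_for(condition, timeout=30.0, interval=1.0,
             clock=time.monotonic, sleep=time.sleep):
    """Poll condition() until it returns True or `timeout` seconds pass.

    Returns True if the condition became true, False if the deadline passed.
    `clock` and `sleep` are injectable so the loop can be tested without
    actually waiting.
    """
    deadline = clock() + timeout
    while True:
        if condition():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

<p>A caller would pass in a function that reports whether the &#8220;Converting the Document&#8221; window is currently open; how that function talks to the application is left out of this sketch.</p>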
<p><strong>Installation</strong></p>
<ul>
<li>Use the Script Editor to save this script as <strong>Run OCR on New Folder Items</strong> under <strong><em>User Home</em>/Library/Scripts/Folder Action Scripts</strong>. You may have to create the <strong>Folder Action Scripts</strong> folder.</li>
<li>Now open a Finder window, control-click and choose <strong>More / Configure Folder Actions&#8230;</strong></li>
<li>Check the <strong>Enable Folder Actions</strong> checkbox, if not checked</li>
<li>Click the &#8220;+&#8221; in the bottom left</li>
<li>Select a folder and click <strong>Open</strong></li>
<li>Choose the script <strong>Run OCR on New Folder Items</strong> and click <strong>Attach</strong></li>
</ul>
<p><strong>Picky Details</strong></p>
<p>As you can see in the source code, there were several issues to address:</p>
<ul>
<li>I had to make sure the script didn&#8217;t step on itself. If FineReader was running, I would wait until it was ready before processing.</li>
<li>The script needed to determine which files had been processed already. This was handled fairly trivially by looking for a matching file with the <strong>processed by FineReader.pdf</strong> suffix. In other words, if I was looking at <strong>Scan001.pdf</strong>, I would see if there was a matching <strong>Scan001 processed by FineReader.pdf</strong> file.</li>
<li>Part of checking for a source file&#8217;s &#8220;buddy&#8221; was stripping off the PDF suffix. This was done in a hackish way by using a one-line shell script, at lines 106-107.</li>
<li>I thought it was important to verify that the source file was, indeed, a ScanSnap file, because FineReader will not process other PDF documents. This was done at lines 117-121 by looking at the Spotlight metadata for the Creator of the source file. That took some more shell scripting (lines 133-154).</li>
<li>The actual work was done by a single line, line 63.</li>
</ul>
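<p>The &#8220;buddy&#8221; check described in the list above can be sketched in a few lines. This is a Python illustration under stated assumptions, not the AppleScript and shell code from the script itself: the function names are hypothetical, and skipping files that are themselves FineReader output is an assumption added for the sketch.</p>

```python
from pathlib import Path

# Suffix taken from the file names described in the post.
SUFFIX = " processed by FineReader"


def buddy_path(source: Path) -> Path:
    """Return the name the OCR output would have for this source PDF."""
    # "Scan001.pdf" -> "Scan001 processed by FineReader.pdf"
    return source.with_name(source.stem + SUFFIX + source.suffix)


def needs_ocr(source: Path) -> bool:
    """True for a PDF that is not OCR output and has no buddy file yet."""
    if source.suffix.lower() != ".pdf":
        return False
    # Assumption for this sketch: never reprocess the OCR output itself.
    if source.stem.endswith(SUFFIX):
        return False
    return not buddy_path(source).exists()
```

<p>Using <code>Path.stem</code> strips the extension without a shell call, which is the job the one-line shell script does in the original.</p>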
<p>The real work was fairly simple, while the bulk of the code was needed to polish pesky little details. Isn&#8217;t that the way code development often is?</p>
<p>If anyone has any improvements on my script, please let me know!</p>
]]></content:encoded>
			<wfw:commentRss>http://paperjammed.com/2009/08/29/automate-scansnap-ocr-process-on-your-mac-with-applescript/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Banish the kids to their own network!</title>
		<link>http://paperjammed.com/2009/06/02/banish-the-kids-to-their-own-network/</link>
		<comments>http://paperjammed.com/2009/06/02/banish-the-kids-to-their-own-network/#comments</comments>
		<pubDate>Wed, 03 Jun 2009 00:16:43 +0000</pubDate>
		<dc:creator>Tad</dc:creator>
				<category><![CDATA[Security]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Data Loss]]></category>
		<category><![CDATA[Geeky]]></category>
		<category><![CDATA[Networking]]></category>
		<category><![CDATA[Portable Devices]]></category>
		<category><![CDATA[Privacy]]></category>
		<category><![CDATA[Windows]]></category>

		<guid isPermaLink="false">http://paperjammed.com/?p=557</guid>
		<description><![CDATA[A nastygram from my ISP let me know that I needed to take action to lock down my home network. In this article I discuss using a spare router in a somewhat unusual daisy chain configuration in order to banish the teenagers and all of their wifi devices to their own network.]]></description>
			<content:encoded><![CDATA[<p><img class="alignright size-medium wp-image-560" src="http://paperjammed.com/wp-content/uploads/2009/06/istock_000006562749xsmall-300x210.jpg" alt="" width="300" height="210" />A few weeks ago I received an unpleasant bit of email from my Internet provider. At first, I thought it was yet another lame spammer or phisher sending me some official-looking notice, but after a moment&#8217;s inspection I realized that this was a real <em>bona-fide </em>official notice.</p>
<p>Their network security department very kindly (and politely) informed me that they had received a &#8220;cease and desist&#8221; order from a particular game publisher. They had included the game publisher&#8217;s email, complete with the incriminating evidence.</p>
<p>There it was: logs showing the MAC address of my cable modem being involved in suspicious <a href="http://en.wikipedia.org/wiki/BitTorrent_(protocol)">BitTorrent</a> activities.</p>
<p>Considering that at any time during the week there can be from two to six or seven different teenagers hanging out in my humble abode, carrying virus-ridden machines, the message was clear: I had to get serious about locking down network access.<span id="more-557"></span></p>
<p><strong>The Problem</strong></p>
<p>I would have liked to buy some net filtering software, slap it on the offending machine, and be done with it; however, I knew that this would be insufficient.</p>
<p>Even if this one event could be traced to a youthful source, a more ominous danger comes from the inevitable malware and viruses that teenagers collect on their machines as they swap cool stuff with their friends.</p>
<p>Complicating things, there are many devices on our home network: Besides their school laptops, the kids have video game consoles and one has an iPod touch, all with wifi access. Think about how many different gadgets are on <em>your</em> home network.</p>
<p>And shutting off access altogether was not an option—there is still schoolwork to be done!</p>
<p><strong>The answer: A Private Network for the Kids</strong></p>
<p>My solution was to put together an unusual network configuration using a second wireless router; I wanted the ability to manage every single kid-owned device at the flip of a switch, while leaving the grownups untouched.</p>
<p><img class="aligncenter size-full wp-image-568" src="http://paperjammed.com/wp-content/uploads/2009/06/20090602-network-devices.gif" alt="" width="600" height="550" /></p>
<p>I hooked the cable modem (<strong>red</strong>) to the main router, shown in <strong>green</strong>. I then plugged a second wireless router, shown in <strong>blue</strong>, into the first.</p>
<p>By doing this, you can see that there is <em>one single wire</em> connecting the entire <strong>blue</strong> network (the kids) to the <strong>green</strong> network. It was trivial to then configure the green router with appropriate access control and filtering for that one single device: the blue router.</p>
<p><strong>Some quirky details</strong></p>
<p>Home routers like these are, by default, configured with a <a href="http://en.wikipedia.org/wiki/Network_address_translation">NAT</a> firewall. They work sort of like one-way mirrors: someone on the network can see out, but nobody can see in. As a result of this, the kids (<strong>blue</strong> devices) can see any device on the main router (<strong>green</strong> devices), such as our print server and the NAS device, but no one can see <em>into</em> the kids&#8217; network.</p>
<p>As paradoxical as it seems, this is exactly what I wanted. By making the kids&#8217; network a private network, it appears to the green router as a single device. When I am configuring access restrictions, I only need to control access for the blue router&#8217;s IP address or MAC address.</p>
<p>Many consumer-grade routers have flaky firmware that doesn&#8217;t behave well when you start doing things like turning on filtering for multiple machines. I simplified things by bringing the number of controlled devices down to <em>one</em>. In addition, filtering on the IP addresses or MAC addresses of individual machines can be easily defeated by manually changing a machine&#8217;s IP address or MAC address. With my configuration, the MAC address being filtered belongs to the blue router, which is locked away safely.</p>
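<p>The one-way mirror behavior can be illustrated with a toy model. This Python sketch is a deliberate simplification (a real NAT also tracks the remote endpoint, the protocol, and timeouts), and the class name, port numbers, and inside addresses are hypothetical.</p>

```python
class ToyNat:
    """Toy NAT router: outbound traffic opens a mapping, and inbound
    traffic is accepted only if it matches an existing mapping."""

    def __init__(self, external_ip):
        self.external_ip = external_ip
        self.next_port = 50000   # arbitrary starting port for the sketch
        self.table = {}          # external port -> (inside ip, inside port)

    def outbound(self, inside_ip, inside_port):
        """A device behind the router sends a packet out. Returns the
        source address that the outside network sees."""
        for port, owner in self.table.items():
            if owner == (inside_ip, inside_port):
                return (self.external_ip, port)   # reuse existing mapping
        port = self.next_port
        self.next_port += 1
        self.table[port] = (inside_ip, inside_port)
        return (self.external_ip, port)

    def inbound(self, external_port):
        """A packet arrives from outside. Returns the inside destination,
        or None if nothing inside asked for it (the packet is dropped)."""
        return self.table.get(external_port)
```

<p>Every device behind the router shows up outside under the same external address, which is why the green router only ever has to deal with one device.</p>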
<p><strong>The Finer Points</strong></p>
<p>If you want to set up a network like this, do the following:</p>
<ul>
<li>(Recommended) Reset the kids&#8217; router. Hold in the router&#8217;s hard reset button while you turn on the power, and keep holding it for 15 seconds or so.</li>
<li>Hook the kids&#8217; router up to a spare laptop using an Ethernet cable. (Turn off the laptop&#8217;s wireless for the time being.)</li>
<li>Use the laptop to navigate to the configuration web page (usually 192.168.1.1).</li>
<li>Set the router&#8217;s own address to a <em>different</em> network from the main network, such as 192.168.<strong>2</strong>.1. <em>This is critical</em>.</li>
<li>Configure the router&#8217;s gateway and DHCP server entries to all point to the <em>main</em> router (192.168.1.1). This tells the kids&#8217; router to use the main router as a source for its DHCP lookups and such, rather than going to the cable modem.</li>
<li>Navigate to the configuration web page at the new address (192.168.2.1). You may need to close the browser and replug the Ethernet cable.</li>
<li>Set up your wireless security for the kids however you like. Make sure to choose a different channel and SSID from your main router.</li>
<li>Remove the laptop and plug the WAN port of the kids&#8217; router into one of the LAN ports of the main router. Restart everything.</li>
<li>Test both networks to make sure things work the way you think they should.</li>
<li>(Optional) You might want to connect to the kids&#8217; router and set its external IP address statically. Make sure that this is set to a number on the home network (e.g. 192.168.1.2).</li>
</ul>
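<p>The addressing rules in the steps above are easy to get wrong, so here is a small Python sketch that checks a planned setup using the standard <code>ipaddress</code> module. The function name is hypothetical, and the /24 prefix (netmask 255.255.255.0) is an assumption, since the steps do not state a netmask.</p>

```python
import ipaddress


def check_daisy_chain(main_router, kids_router_lan, kids_router_wan, prefix=24):
    """Return a list of problems with the planned addresses (empty if none).

    prefix=24 is an assumption; adjust it to match the routers' netmask.
    """
    main_net = ipaddress.ip_interface(f"{main_router}/{prefix}").network
    kids_net = ipaddress.ip_interface(f"{kids_router_lan}/{prefix}").network
    wan = ipaddress.ip_address(kids_router_wan)

    problems = []
    if main_net.overlaps(kids_net):
        problems.append("kids' LAN must be on a different network from the main LAN")
    if wan not in main_net:
        problems.append("kids' router WAN address must be on the main network")
    if wan == ipaddress.ip_address(main_router):
        problems.append("kids' router WAN address collides with the main router")
    return problems
```

<p>With the addresses used in this article, <code>check_daisy_chain("192.168.1.1", "192.168.2.1", "192.168.1.2")</code> reports no problems.</p>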
<p>Some notes:</p>
<ul>
<li>You can only maintain the kids&#8217; router from a machine connected to the kids&#8217; network; the home network cannot see the management screens. If you wish, you could enable remote management for the kids&#8217; network only, since the main home router is still protecting the whole network from intruders.</li>
<li>Computers on the kids&#8217; network can see all devices, but they aren&#8217;t on the same network. This means that network printers and NAS devices are accessible, but you will have to attach to them using IP addresses. I was able to easily set up the machines on the 192.168.2.x network to use a print server on 192.168.1.100.</li>
<li>For machines that should have full access (a.k.a. <em>yours</em>), make sure that you either set the <strong>green</strong> network to be a higher priority or remove the <strong>blue</strong> network SSID entry altogether. I found out the hard way that my iMac would randomly pick the green or the blue depending on which one it saw first when it woke up.</li>
<li>This does <em>not</em> wall off your main network; it simply provides a single point of control for the entire kids&#8217; network. In other words, don&#8217;t depend on this setup to prevent malware on the kids&#8217; machines from seeing your machine. You can, however, set up your PC to not trust the kids&#8217; network.</li>
</ul>
<p><strong>Wireless Network Security</strong></p>
<p>Regardless of how you set up your network, make sure you use at least WPA encryption (never use WEP!). Make sure your passwords are solid.</p>
<p><strong>Using DD-WRT on my new wireless router</strong></p>
<p>In addition to the new network configuration, I went one step further and chose a main router that lends itself well to installation of open-source firmware. I ordered a <a href="http://www.amazon.com/Linksys-Cisco-WRT54GL-Wireless-G-Broadband-Compatible/dp/B000BTL0OA/ref=sr_1_1?ie=UTF8&amp;s=electronics&amp;qid=1243905597&amp;sr=8-1">Linksys WRT54GL</a> from Amazon for a little over fifty bucks. I chose this one because, as a direct descendant of the venerable <a href="http://en.wikipedia.org/wiki/WRT54G">WRT54G</a>, this router is very well suited for running alternative firmware such as <a href="http://en.wikipedia.org/wiki/Dd-wrt">DD-WRT</a>, giving substantial control over things like, say, access control&#8230;</p>
<p>Within a half hour after my new router arrived, I had gone to the <a href="http://www.dd-wrt.com/dd-wrtv3/dd-wrt/hardware.html">Supported Hardware</a> page, obtained the latest build of DD-WRT, and replaced the Linksys firmware with the far-better open source code.</p>
<p>I won&#8217;t go into the specifics of installation here, but it isn&#8217;t very challenging. Check out the <a href="http://www.dd-wrt.com/dd-wrtv3/index.php">DD-WRT site</a> for details.</p>
<p><strong>Closing Thoughts</strong></p>
<p>Make no mistake: we are responsible for whatever happens on our home networks. It is just like your home telephone: if someone dials up some 900 number and rings up a thousand-dollar phone bill, the phone company won&#8217;t care a whit who did it; you will still pay. Likewise, regardless of who did the BitTorrent download, the homeowner bears a certain degree of responsibility for locking down the network.</p>
<p>Another point: Without some degree of personal responsibility on the part of the kids in the house, this sort of activity would simply be an arms race of filtering and blocking versus hacking. My goal is to help keep the honest people honest and to make life more difficult for the viruses and malware.</p>
]]></content:encoded>
			<wfw:commentRss>http://paperjammed.com/2009/06/02/banish-the-kids-to-their-own-network/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>A cheap and cheerful way to reduce Internet surprises</title>
		<link>http://paperjammed.com/2009/05/26/a-cheap-and-cheerful-way-to-reduce-internet-surprises/</link>
		<comments>http://paperjammed.com/2009/05/26/a-cheap-and-cheerful-way-to-reduce-internet-surprises/#comments</comments>
		<pubDate>Tue, 26 May 2009 21:51:14 +0000</pubDate>
		<dc:creator>Tad</dc:creator>
				<category><![CDATA[Paperless Life]]></category>
		<category><![CDATA[Security]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Geeky]]></category>
		<category><![CDATA[Good Sites]]></category>
		<category><![CDATA[Online Services]]></category>
		<category><![CDATA[Privacy]]></category>
		<category><![CDATA[Reviews]]></category>

		<guid isPermaLink="false">http://paperjammed.com/?p=539</guid>
		<description><![CDATA[Anyone who has kids in their home worries about how easy it is to access the seamier side of the Internet, even if by accident. Indeed, it is thrust upon us in our email in-boxes daily in the form of misspelled spam with links that only a fool would click.
Another issue altogether is the spam [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignright size-medium wp-image-542" src="http://paperjammed.com/wp-content/uploads/2009/05/istock_000000230827xsmall-300x199.jpg" alt="" width="300" height="199" />Anyone who has kids in their home worries about how easy it is to access the seamier side of the Internet, even if by accident. Indeed, it is thrust upon us in our email in-boxes daily in the form of misspelled spam with links that only a fool would click.</p>
<p>Another issue altogether is the spam email that is carefully crafted to appear as if it has come from your bank, saying cheerfully &#8220;Your statement for May is available online, just click here to access!&#8221; &#8230; but whoever clicks will inevitably be providing their secrets to some ne&#8217;er-do-well in New Zealand who will promptly empty their accounts.</p>
<p>Here is a simple, quick, and free way to avoid phishing attacks as well as casual/accidental exposure to unwanted adult content.<span id="more-539"></span></p>
<p><strong>OpenDNS</strong></p>
<p>The service I am referring to is <a href="http://www.opendns.com/">OpenDNS</a>, a free domain name lookup service that you can use in lieu of your Internet Service Provider&#8217;s own DNS servers.</p>
<p>When your computer goes to a web site, the name of the web site must be converted to a numeric address, in much the same way that you use a telephone directory to look up a friend&#8217;s number.</p>
<p>This lookup service is typically provided by a server owned by your Internet Service Provider. The address of this server is automatically configured when your cable modem connects to the network the first time.</p>
<p>To use OpenDNS, you change the Domain Name Server (DNS) setting in your router to point to the OpenDNS servers instead of your ISP&#8217;s servers. By doing this, you have changed the default telephone directory used by your home network.</p>
<p><strong>A Phone Book with the Bad Numbers Missing</strong></p>
<p>To take the phone book analogy further, imagine that in your new phone book, all of the phone numbers for shady businesses such as escort services and massage parlors have been replaced with a special number. When you dial that number, a pleasant older woman gives you a gentle scolding for trying to call such a business.</p>
<p>This is pretty much what happens with OpenDNS: when your browser asks for a page from www.naughtystuff.com, the OpenDNS server points you to a different place, a nice page from OpenDNS that says that the page is blocked and explains why.</p>
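<p>The phone book analogy can be made concrete with a toy resolver. This Python sketch only models the idea and is not how OpenDNS is implemented: the class name is hypothetical, the addresses come from ranges reserved for documentation, and a real service blocks by category rather than from a hand-written list.</p>

```python
BLOCK_PAGE = "203.0.113.10"   # hypothetical address of the "blocked" page


class FilteringResolver:
    """Toy name lookup: blocked names resolve to the block page instead
    of their real address."""

    def __init__(self, directory, blocked):
        self.directory = dict(directory)   # name -> address, the "phone book"
        self.blocked = set(blocked)        # names whose numbers are replaced

    def resolve(self, name):
        name = name.lower().rstrip(".")    # names are case-insensitive
        if name in self.blocked:
            return BLOCK_PAGE
        return self.directory.get(name)    # None if the name is unknown


resolver = FilteringResolver(
    directory={"paperjammed.com": "192.0.2.20",
               "www.naughtystuff.com": "198.51.100.7"},
    blocked={"www.naughtystuff.com"},
)
```

<p>Here <code>resolver.resolve("www.naughtystuff.com")</code> hands back the block page address, while <code>resolver.resolve("paperjammed.com")</code> returns the entry from the directory unchanged.</p>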
<p><strong>One fix for your Entire Network</strong></p>
<p>There are many options available for &#8220;net nanny&#8221; style software that can be installed on individual machines, such as the kids&#8217; machine. These features are also embedded in modern versions of Windows and OS X. But what about all of the little portable devices that find their way into kids&#8217; hands? How about their gaming consoles?</p>
<p>Since you configure OpenDNS at the network entry point to your home, the router, any device attached to your network is automatically covered.</p>
<p><strong>Customizable Blocking</strong></p>
<p>You can use OpenDNS without an account, just by pointing your router to their servers, but the real power comes when you register with them (for free) and make your own choices about what you want to see.</p>
<p>You can choose which parts of the Internet you don&#8217;t want to see using their online configuration tool. You can either use their &#8220;High/Moderate/Medium/Low/Minimal&#8221; options or you can pick and choose individual bits of stuff to allow or block.</p>
<p><img class="aligncenter size-full wp-image-545" src="http://paperjammed.com/wp-content/uploads/2009/05/20090526-opendns1.gif" alt="" width="583" height="589" /></p>
<p>Here&#8217;s a look at the categories available when you choose the custom blocking level:</p>
<p><img class="aligncenter size-full wp-image-546" src="http://paperjammed.com/wp-content/uploads/2009/05/20090526-opendns2.gif" alt="" width="393" height="337" /></p>
<p><strong>Basic Setup (about 20 minutes)</strong></p>
<ul>
<li><a href="https://www.opendns.com/start/">Configure your router</a> to use the OpenDNS servers for DNS lookups.</li>
<li>Create a free <a href="https://www.opendns.com/start/create_account/">OpenDNS account</a>.</li>
<li>Install their <a href="http://www.opendns.com/support/article/90">small updater program</a> on one machine on your network.</li>
<li>Log in to your <a href="https://www.opendns.com/dashboard/">OpenDNS Dashboard</a> on the web and configure your blocking settings to taste.</li>
</ul>
<p><strong>Why do you need the updater utility?</strong></p>
<p>In order to provide the custom blocking, the OpenDNS servers need to know the main IP address assigned to you by your Internet Service Provider. The desktop utility simply informs OpenDNS of your new IP address if it ever changes.</p>
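<p>The logic of such an updater is simple enough to sketch. This Python example is a guess at the general shape, not the actual OpenDNS utility: the class name and both callables are hypothetical, and how the public address is discovered or reported is left to the caller.</p>

```python
class IpUpdater:
    """Reports the public IP address only when it has changed."""

    def __init__(self, get_public_ip, notify):
        self.get_public_ip = get_public_ip   # callable returning the current IP as a string
        self.notify = notify                 # callable that receives the new IP
        self.last_ip = None                  # nothing reported yet

    def check(self):
        """Run one check. Returns True if a report was sent."""
        current = self.get_public_ip()
        if current == self.last_ip:
            return False                     # unchanged, nothing to do
        self.notify(current)
        self.last_ip = current
        return True
```

<p>Calling <code>check()</code> on a schedule means the service hears from the home network once at startup and again only when the ISP hands out a different address.</p>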
<p><strong>What do users see if they go to a blocked page?</strong></p>
<p>They see a page that indicates the site that was blocked, along with a short reason and a link they can click if they want access to the page. If they click that link and fill out the short form, you will get an email from OpenDNS with the user&#8217;s request.</p>
<p>The remainder of the &#8220;blocked&#8221; page is a search form with some sponsored links.</p>
<p>You can customize the message as well as the image shown on the web page. When someone reaches a blocked page in my network, they are greeted by a picture of our calico cat, Roxy.</p>
<p><img class="aligncenter size-full wp-image-547" src="http://paperjammed.com/wp-content/uploads/2009/05/20090526-opendns3.gif" alt="" width="531" height="556" /></p>
<p><strong>Keeping the Honest People Honest</strong></p>
<p>This approach to blocking unwanted web sites is not a complete solution for keeping your kids from where they shouldn&#8217;t go; it is more like a simple padlock: it keeps the honest people honest. A determined individual can easily get around this product using various techniques, but they have to make a conscious effort to do so.</p>
<p>The real strength of OpenDNS is that it helps avoid accidental exposure to unwanted web content and phishing sites.</p>
]]></content:encoded>
			<wfw:commentRss>http://paperjammed.com/2009/05/26/a-cheap-and-cheerful-way-to-reduce-internet-surprises/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
