<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Carefully inspect all scanned documents</title>
	<atom:link href="http://paperjammed.com/2009/02/09/carefully-inspect-all-scanned-documents/feed/" rel="self" type="application/rss+xml" />
	<link>http://paperjammed.com/2009/02/09/carefully-inspect-all-scanned-documents/</link>
	<description>Has paper taken over your life?</description>
	<lastBuildDate>Thu, 01 Jul 2010 19:23:15 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
	<item>
		<title>By: Tad</title>
		<link>http://paperjammed.com/2009/02/09/carefully-inspect-all-scanned-documents/comment-page-1/#comment-6064</link>
		<dc:creator>Tad</dc:creator>
		<pubDate>Fri, 01 Jan 2010 01:21:28 +0000</pubDate>
		<guid isPermaLink="false">http://paperjammed.com/?p=126#comment-6064</guid>
		<description>Hi Sean,

The direction of copy/paste highlighting is all about how the PDF document is generated, and this is going to be a direct result of the OCR process.
The scanner typically generates TIFF or PDF files containing raw images (in your case at 300dpi). The OCR software then processes those files and generates new documents that have the recognized text layered on top of the image.
This way, you still see the original image, while the text can be selected and copied.

Something in this process is not set correctly; the scanner and the OCR software are somehow out of sync. If I understand correctly, the invisible OCR text has been laid out incorrectly over the visible image, or possibly the text flow has been configured in a columnar fashion.

This is definitely an OCR software configuration issue.

Your scanner seems to ship with OmniPage Pro as well as Visioneer OneTouch with Kofax VRS. I&#039;m not sure if the latter duo provides OCR, but OmniPage Pro definitely does. Which of these products performs the OCR in your workflow?

There may be a setting that configures how spreadsheets are treated: columnar or row flow. Perhaps it is as simple as telling OmniPage Pro to work in Spreadsheet mode (an option I see in the online docs).</description>
		<content:encoded><![CDATA[<p>Hi Sean,</p>
<p>The direction of copy/paste highlighting is all about how the PDF document is generated, and this is going to be a direct result of the OCR process.<br />
The scanner typically generates TIFF or PDF files containing raw images (in your case at 300dpi). The OCR software then processes those files and generates new documents that have the recognized text layered on top of the image.<br />
This way, you still see the original image, while the text can be selected and copied.</p>
<p>Something in this process is not set correctly; the scanner and the OCR software are somehow out of sync. If I understand correctly, the invisible OCR text has been laid out incorrectly over the visible image, or possibly the text flow has been configured in a columnar fashion.</p>
<p>This is definitely an OCR software configuration issue.</p>
<p>Your scanner seems to ship with OmniPage Pro as well as Visioneer OneTouch with Kofax VRS. I&#8217;m not sure if the latter duo provides OCR, but OmniPage Pro definitely does. Which of these products performs the OCR in your workflow?</p>
<p>There may be a setting that configures how spreadsheets are treated: columnar or row flow. Perhaps it is as simple as telling OmniPage Pro to work in Spreadsheet mode (an option I see in the online docs).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sean</title>
		<link>http://paperjammed.com/2009/02/09/carefully-inspect-all-scanned-documents/comment-page-1/#comment-5753</link>
		<dc:creator>Sean</dc:creator>
		<pubDate>Wed, 23 Dec 2009 10:57:15 +0000</pubDate>
		<guid isPermaLink="false">http://paperjammed.com/?p=126#comment-5753</guid>
		<description>Hi there! Found your site whilst googling an issue I am having. I read through this article and definitely picked up some great tips. However, I was wondering if you ever came across this particular issue:

Using a Xerox Documate 632 scanner with feed or glass scanning, I scan a series (or just one page) of horizontally aligned (i.e. landscape) pages. These are basically Excel tables filled with text - addresses, names etc. The pages are read horizontally, i.e. left to right.

Now the scanner does OCR at 300dpi. However, the OCR is reading the page from top to bottom. In other words, when I try to highlight text in the document, it highlights in a downwards fashion, whereas the text should highlight as it reads - left to right.

Have you seen anything like this before? If so, do you have any recommendations to ensure the scan does OCR in the correct direction?

Any advice would be greatly appreciated.

Thanks!</description>
		<content:encoded><![CDATA[<p>Hi there! Found your site whilst googling an issue I am having. I read through this article and definitely picked up some great tips. However, I was wondering if you ever came across this particular issue:</p>
<p>Using a Xerox Documate 632 scanner with feed or glass scanning, I scan a series (or just one page) of horizontally aligned (i.e. landscape) pages. These are basically Excel tables filled with text &#8211; addresses, names etc. The pages are read horizontally, i.e. left to right.</p>
<p>Now the scanner does OCR at 300dpi. However, the OCR is reading the page from top to bottom. In other words, when I try to highlight text in the document, it highlights in a downwards fashion, whereas the text should highlight as it reads &#8211; left to right.</p>
<p>Have you seen anything like this before? If so, do you have any recommendations to ensure the scan does OCR in the correct direction?</p>
<p>Any advice would be greatly appreciated.</p>
<p>Thanks!</p>
]]></content:encoded>
	</item>
</channel>
</rss>
