Wednesday 21 June 2006

Scannerless scans R go

If you occasionally need to digitise paper copies of documents and own a digital camera, but not a scanner, scanR might appeal to you. It takes a snapshot of a document or whiteboard, cleans it up, re-aligns it and generally makes it more legible for screen reading. In this respect it does an admirable job.

As I was more interested in scanR's ability to decipher the characters found within an image and transform them into something more accessible, I decided to test the service using a pre-scanned article from a magazine rather than a photograph. The results were less than impressive. Highlighting specific parts of the text within the generated PDF was very hit and miss, and the chunks I attempted to copy and paste in no way corresponded to my original selection. It is possible to extract all the text from a document and paste it into an editor, but a lot of painstaking massaging would be required before you could do anything useful with it.

To be fair to them, nowhere on the scanR web site do they rave about their software's ability to perform optical character recognition beyond the functional level required to index documents by keywords. Nevertheless, you'd expect this technology to go with the territory. It would be like buying a kettle only to discover that boiling water isn't an option with your chosen model.

It may not do proper OCR (or boil water for that matter), but you can't fault its box-ticking Web 2.0 zeitgeist. With its abbreviated, missing-vowel chic, scanR can't fail to be a hit with the hip txt-speak generation.

1 comment:

Trias said...

I would have thought OCR would be easy to integrate or at least offer as an additional service.