Tuesday, July 26, 2011
Scanners Work In Vain
is "well you can just scan it in", referring to copying a book or a story.
My current response to the "you can just scan it in" argument is "like Hell you can!"
For the past couple of weeks I have been trying to scan in a novella I wrote with Ernest Hogan, called "Obsidian Harvest," that we intend to publish as an ebook. On the basis of my experiences I'd say scanning digest-sized (paperback book size) pages is in fact extremely difficult. The real reason we're not overrun with scanned pirated versions of books is that it's damned hard to do.
Part of the reason it took so long was that I was also in the final stages of publishing my new book titled "Shift Happens: The New E-Publishing Paradigm And What It Means For Writers." That was another "interesting" experience, albeit much less frustrating than the scanner.
My first attempt at scanning was with my quirky HP 8500 all-in-one printer-scanner-fax machine. It took me over an hour to scan in the 28 magazine pages. Then it went through OCR and the fun really began.
First off, the quality of the scan and resulting OCR was lousy. There was a mistake or two on nearly every line. Worse, whole sections of the story had simply not been picked up. At several places in the copy I was missing half a page or more. In short the scan was unusable.
Now I regularly use the scanner for contracts and such without the OCR and it has performed satisfactorily. So I assumed the OCR software that came with the machine (OEMed from IRIS) wasn't up to the job.
My next step was to go to Nuance and order OmniPage 18, a highly recommended scanning and OCR package. After some hassles getting it installed, I tried it on the PDF file the IRIS software had created. This is a file of graphic images and most OCR packages will accept it.
The results were definitely better, but there were still a lot of mistakes. And of course the gaps in the document were still there.
Okay, recalling the famous dictum of John W. Campbell Jr.: "Always use the proper tool for the job. The proper tool to fix a television is a television repairman." I decided to take doczilla to a scanning service and have it scanned professionally.
The first place I tried was a regular commercial service. After some back and forth on the phone the guy at the service told me that basically they couldn't do it. Not only was my project too small physically for their scanners to feed, it was also too small a job for them. "Now if you had 2800 pages instead of 28 . . ." my informant told me.
After calling a couple more services I got the same response. They all dealt in letter or legal sized pages printed on a dead-white background and in quantities in the thousands.
Then I decided to try one of my local quick print places. They did indeed have a scanner for small quantities, but when I took it down there the answer was the same: They couldn't do it. Their problem was the paper size. It was too small to feed reliably through their sheet feeder.
In talking to the very helpful guy at the copy center, I found out why I was getting gaps in the scans. My all-in-one simply didn't have enough RAM to handle the job. When it ran out of RAM it quit OCR until it caught up -- with no warning, naturally.
Okay, I've got one final shot. I dug the original manuscript out of my files and today I'll take it back to the copy center and see if they can do that. It's a little dog-eared, but it is a clearly printed original. If that doesn't work, it's time to hire a typist.
The point of this long, rambling tale is that "just scanning it in" isn't easy, especially when you're dealing with digest-size or paperback book-size packages. While it's theoretically easy, the practice for some kinds of documents is a lot harder. It doesn't help that you've got to take the pages out of the original to get a clean scan.
There are a lot of things like this in our high-tech world where the gap between "we can do it" and "we can do it easily and routinely" is broad enough to defeat even semi-serious efforts to make it work. Just because we can do something doesn't mean it has been reduced to everyday practice and just because something is reduced to everyday practice in one field doesn't mean it will transfer easily to another, even closely related, field.
Saturday, September 22, 2007
WHACK THE GOPHER IV: THE FINAL CHAPTER
So, after parts I, II, and III of this series, the logical question is "what can we do about it?"
There is a lot we can do, but none of it is aimed at stopping people from posting copyrighted fiction on free sites. That ain't gonna happen, no matter how much the dinosaurs bellow in the swamps.
However that is a long, long way from saying copyrights are useless and authors can't expect to get paid for their work. Copyrights are not useless and authors can not only expect to get paid, most of the smart ones can expect to make more money in this brave new world than in the old.
The bad news is that genre fiction is going to be available for free on the internet. There is simply no way to stop it. SFWA can file all the DMCA takedown notices it wants. Individual authors can sue if they want. Crazed Luddite SFWA vice-presidents can rant about "netscabs" (on other people's pages because they're too technophobic to have one of their own). And none of it matters. People will continue to post copyrighted works for free. For every one you can shut down there will be two, or ten or 20 more.
The technology has simply moved beyond the kind of control publishers had a hundred years ago. Live with it.
(There is also going to be a sea change in the way genre fiction, especially science fiction and erotica, is going to be distributed in this country. This will probably mean the death of a lot of major publishers, and the transformation of the book store into something nearly unrecognizable. There are a lot of complex reasons for this and it really deserves a post of its own.)
The good news about all this is there is going to be a lot more genre fiction available to readers at a lot lower prices and as a class the authors are going to be a lot better compensated.
One way or another, most genre fiction is going to be sold over the internet. You'll either buy it directly on your own computer, or you'll get it in electronic or print form from something like a print on demand kiosk. You may even download and print books on your home system. That's not as big a job as you might think. To see what I mean DAGS "Blue Squirrel".
The Real Solution To Piracy
But while you can't stop free distribution, what you can stop is piracy for profit. Whether it's designer knock-offs, DVD movies or online fiction, if someone is paying for it, it's a lot easier to control.
"Stop" is a misnomer. You can't really stop piracy. But you can crack down on it hard enough to keep it down to an acceptable level.
The reason is that there's a money trail. If you can't locate the pirate through the work posted, you can locate them by following the money. That's why outfits like the RIAA have been a lot more successful at shutting down the commercial pirates than the file sharers.
The legitimate publisher has some advantages as well. One of the big ones is convenience. Why go to the trouble of searching out a pirate site, when you can go to someplace like Amazon and get everything you want in one place?
Today the incentive is money. Novels are expensive. When books are instantly available for, say, a dollar each, it becomes much less of an incentive. In fact for most people it drops below the action threshold.
And yes, we can make novels available for a dollar or so each without significantly cutting into the author's royalties. In fact the late G. Harry Stine and I were in the process of forming just such an online publishing venture several years ago when Harry's untimely death ended the project. Our rather extensive calculations indicated that not only would the authors make as much money as they do now, but the profits to the publisher would be quite nice as well. Most of the cost of a book today is eaten up in an unwieldy system of production and distribution - but that's a subject for another post.
One of the reasons is that as cost goes down, sales go up. I firmly believe that low-cost books will sell enough to swamp the effects of pirate postings - which, as we saw in a previous post in this series, probably aren't resulting in that many lost sales anyway.
So, low price means high sales and less piracy. We've seen this happen before, specifically in the software industry. Back in the early 1980s Borland stood the software business on its head with Turbo Pascal, a full implementation of the Pascal programming language, complete with a nice little Integrated Development Environment (IDE) for the amazing price of $35. That was perhaps a tenth of what competing versions of Pascal were selling for and Borland sold a ton of copies.
What was interesting about this was that unlike most of its high-priced rivals, Turbo Pascal wasn't copy protected. Borland made no attempt to stop anyone from copying the disks. Philippe Kahn, Borland's saxophone-playing president, figured that by keeping the price so low - for the time anyway - he removed most of the incentive to steal Turbo Pascal.
It's worth noting that except for games, most software companies have followed Kahn's lead. Software copy protection as a field isn't dead, but it is generally moribund.
Okay, that's not the whole story. And the way it isn't the whole story is interesting in itself. Kahn did one other thing with Turbo Pascal: He provided a neatly printed manual, which was (misnomer alert) perfect bound (/misnomer alert) like a paperback book. That meant that if you opened it flat to copy it, the spine cracked and the pages fell out. What Kahn did (and having met the guy I'm sure he did it deliberately) was to provide a way to add value to a legitimate purchase that the pirates couldn't match.
Changes in the product
But what about fiction? It doesn't need a manual, after all.
No it doesn't, but that's the other part of the change we're facing. The nature of what authors sell is going to change as well. Increasingly, it won't be just a book or a story, it will be membership in a community.
Successful works of genre fiction tend to build communities naturally. You can see the proof walking the halls of any science fiction convention. Savvy authors are going to use new media tools to capitalize on this to build not just sales, but a loyal following and to provide other products as well.
To see a very early example of this, stop by Baen Publishing's web site and pay special attention to the "1632" universe in all its ramifications. 1632 was originally the brainchild of Eric Flint, who also manages the Baen Free Library. It is the story of a West Virginia coal mining town suddenly plunked down in Germany at the height of the Thirty Years' War. It is alternate history at its finest and most fun and the original novel has been followed up by a sprawling collection of novels and short story collections. It has also spawned a very active fan base, many of whom hang out at the Baen web site, especially in the forum called "Baen's Bar."
The development is still nascent, but with a little imagination it's easy to see how something like the 1632 phenomenon could provide even more value to the readers - value that a lot of them would be willing to pay for.
Changes in the authors
The other thing this encourages is a completely different approach to writing genre fiction. While there will undoubtedly be authors who will continue to do things the way we do them now, the ones who will be most successful will be the ones who embrace the notion of community-building around their fiction.
In a sense this is a throwback to the 19th Century when popular authors like Twain and Dickens made more of their money on lecture tours than they did from the sales of their books. However the effect will be enhanced, amplified and zoomed up by the use of everything from web sites and blogs to YouTube videos and MySpace pages.
The author becomes the focus of community and the only thing the free posters will do is build that community further.
The world will be different, the demands on the authors will be different, but in many ways, both socially and financially, it will be a much more rewarding world for those who are willing to adapt.