Rediscovering Dickens

A chronicle of the transcription of 20 issues of Household Words by Charles Dickens. http://household.umkc.edu

Location: Kansas City, Missouri, United States

I'm working on a pet project to digitize all of the issues of Charles Dickens's weekly magazine, Household Words, that contained portions of his novel Hard Times. Since OCR software is expensive, I'm transcribing all 20 issues by hand. Since I am actually interested in what I am typing (and am therefore reading as I go), and I am not the speediest of typists, this will take me a little while. This blog will chronicle my progress and my thoughts about the project and its content along the way. Why should you care? If you are at all interested in how popular culture evolves, how the middle class came to be, and how literature is affected within and without its context, you should read on. If you couldn't care less about such things, then you might want to go elsewhere. Thanks for visiting - I hope you will return. - Lynn

Monday, September 18, 2006

"The Workhouse often evokes the grim world of Oliver Twist, but its story is also a fascinating mix of social history, politics, economics and architecture.
This site, www.workhouses.org.uk, is dedicated to the workhouse — its buildings, inmates, staff and administrators, even its poets..."

Friday, September 08, 2006

Remember when I mentioned that Dickens uses filler articles? Looking over the titles of the upcoming articles, it looks like I have some more excellent reading ahead (cough, cough, ahem). Specifically, I am referring to "The Art of Boreing." Need I say more?

I've recently put up images of issues 215, 216, and 217. I hate that part. It's tedious. It's boring. It makes me cry tears that, strangely enough, taste like sadness and old pennies. There are 12 more issues to bring online, and I don't look forward to the mind-numbing process that is that effort. Let me explain:

I have scans saved as .tif files of every page from every issue of Household Words that contains Hard Times. These scans have extra white space (hereafter known as "crap") around the edges. So for every page, I must open the image file in Adobe Photoshop, cut out the crap, resize the image (because the originals are HUGE) and resave the image as a .jpg file so you can see it with your web browser. When I open more than a couple of these ginormous files in Photoshop, my computer cries and decides to punish me for a few minutes by refusing to do anything.
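For the curious, the geometry behind "cut out the crap, then resize" can be sketched in a few lines of Python. This is only an illustration, not what Photoshop does internally: it assumes a page has already been loaded as rows of grayscale values (0 = black, 255 = white), and the function names and the threshold of 250 are made up for the example. Actually reading a .tif and writing a .jpg would need an imaging library, which is left out here.

```python
def content_bbox(pixels, white_threshold=250):
    """Return (left, top, right, bottom) of the non-white area, or None if blank.

    pixels is a list of rows of grayscale values (assumed layout).
    right and bottom are exclusive, like a Python slice.
    """
    # Rows and columns that contain at least one pixel darker than "white".
    rows = [y for y, row in enumerate(pixels)
            if any(v < white_threshold for v in row)]
    if not rows:
        return None
    cols = [x for x in range(len(pixels[0]))
            if any(row[x] < white_threshold for row in pixels)]
    return (cols[0], rows[0], cols[-1] + 1, rows[-1] + 1)


def crop(pixels, bbox):
    """Keep only the pixels inside the bounding box."""
    left, top, right, bottom = bbox
    return [row[left:right] for row in pixels[top:bottom]]


def scaled_size(width, height, max_width):
    """Shrink to max_width while keeping the aspect ratio; never enlarge."""
    if width <= max_width:
        return width, height
    return max_width, max(1, round(height * max_width / width))
```

With something like this, a scan's white border becomes a bounding box to crop to, and `scaled_size(3000, 4500, 600)` gives the dimensions for a web-sized copy (the 600-pixel width is an arbitrary example, not the size used on the site).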

Once the images are converted to crap-free .jpg files, I copy one issue's worth up to the web server to begin the stupendafabulous process of putting the issues together so you can "flip through" them, so to speak. This is the part I hate the most, and the part that I hope to automate for the remaining 12 issues. Right now, I change every link by hand. There are 20 issues, with 24 pages each. Each page has 6 links to modify, and one picture to resize. That means that there are potentially 3,360 ways for me to screw up, lose my place and wish to run repeatedly into a wall.
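Here is a rough sketch of how the link bookkeeping could be generated instead of typed. Everything about the naming is an assumption for illustration: the `issue215/page01.html` file layout and the particular six links (first, previous, next, last, contents, full-size image) are hypothetical, since the site's real structure isn't described here. Only the counts (20 issues, 24 pages, 6 links, 1 picture) come from the paragraph above.

```python
PAGES_PER_ISSUE = 24  # from the post: each issue has 24 pages


def page_name(issue, page):
    # Hypothetical naming scheme, e.g. "issue215/page01.html".
    return f"issue{issue}/page{page:02d}.html"


def page_links(issue, page, pages=PAGES_PER_ISSUE):
    """Build the six links for one page (the link set itself is assumed)."""
    prev_page = max(1, page - 1)       # the first page links back to itself
    next_page = min(pages, page + 1)   # the last page links to itself
    return {
        "first": page_name(issue, 1),
        "previous": page_name(issue, prev_page),
        "next": page_name(issue, next_page),
        "last": page_name(issue, pages),
        "contents": f"issue{issue}/index.html",
        "image": f"issue{issue}/page{page:02d}.jpg",
    }


def manual_edit_count(issues=20, pages=PAGES_PER_ISSUE, links=6, images=1):
    """How many hand edits the manual process involves."""
    return issues * pages * (links + images)
```

The arithmetic checks out: `manual_edit_count()` is 20 × 24 × (6 + 1) = 3,360. Looping `page_links(issue, page)` over every page would hand back all of those links without anyone losing their place.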

I'm just beginning to work on the automated process, and hopefully within the next couple of weeks I will have built a better mousetrap. Doing this will let me focus on transcription, on commentary, and on finding connections between the novel and the magazine, which is the real point of all of this anyway.