Wednesday, May 25, 2011

Learning Rails 3: Live Edition - O'Reilly Media

O'Reilly Media book: Sinatra: Up and Running

Sinatra: Up and Running:

Take advantage of Sinatra, the Ruby-based web application library and domain-specific language used by GitHub, LinkedIn, Engine Yard, and other prominent organizations. With this concise book, you will quickly gain working knowledge of Sinatra and its minimalist approach to building both standalone and modular web applications.


Saturday, May 21, 2011

/tmp/docview500 ???

I found that directory on my openSUSE Linux computer. It contains a great many files whose names are derived from files in my home directory, and the time stamps are quite recent. I have no idea which software created that directory tree. It would be a relief to know it was KDE, but why would KDE do that?

Friday, May 20, 2011

"A picture is worth a thousand words"

A picture is worth a thousand words - Wikipedia, the free encyclopedia

Unicode (UTF-8) Test – looks a little like the ASCII table from the old times

Unicode (UTF-8) Test

extracting information from a rather detailed PDF (from a software developer's point of view)

When I access a PDF, I prefer to read the XML that "pdftohtml -xml" creates for it. Although XML::Simple lacks some features I miss, I find that module rather convenient.
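
As a rough sketch of that first step (in Python, not the Perl and XML::Simple mentioned above), the text elements could be collected like this. The element and attribute names (`page`, `text`, `top`, `left`, `width`, `height`) are assumed to match what "pdftohtml -xml" emits; the sample XML and the function name `read_text_items` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the shape assumed for "pdftohtml -xml" output.
SAMPLE_XML = """<pdf2xml>
  <page number="1" top="0" left="0" height="1263" width="892">
    <text top="500" left="60" width="90" height="14" font="0"><b>Net pay</b></text>
    <text top="500" left="400" width="60" height="14" font="0">1234.56</text>
  </page>
</pdf2xml>"""


def read_text_items(xml_string):
    """Return one dict per <text> element, with its position and its string."""
    root = ET.fromstring(xml_string)
    items = []
    for page in root.iter("page"):
        for text in page.iter("text"):
            items.append({
                "page": int(page.get("number")),
                "left": int(text.get("left")),
                "top": int(text.get("top")),
                "width": int(text.get("width")),
                "height": int(text.get("height")),
                # itertext() also picks up text nested in <b>, <i>, ...
                "string": "".join(text.itertext()).strip(),
            })
    return items


items = read_text_items(SAMPLE_XML)
for item in items:
    print(item["left"], item["top"], item["string"])
```

Joining `itertext()` is a deliberate choice here: a text element may wrap its string in inline markup, and a plain `.text` lookup would then return nothing.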

Think of a pay slip as a PDF. It has quite a regular structure. (Of course, you might prefer to receive an XML representation of it directly from the salary software, but that's another issue. In this particular case that looked rather hard to achieve.)
There are labels and there are values. I want to access values by their labels. Therefore I need a specification describing where the value belonging to a specific label is located relative to that label. I do this by giving a relative rectangular region. All text strings provided by "pdftohtml -xml" (i.e. the text elements) get stored in a matrix (X×Y). So far there have been no big obstacles to accessing the value for a label by scanning the matrix within that relative rectangular region.
I usually neither want nor need to specify where the label is located on the page. Why would you want to specify that as long as it's not necessary?
But certain labels appear more than once. In that case I add the absolute rectangular region of the label. Of course, this spec is as terse as possible. In the coordinates of the "pdftohtml -xml" output, a page has its origin at the upper left corner (you do know that). So if the label is just above y=500, you need to give neither the upper left corner nor the lower right corner of the respective rectangular region; a single y bound is enough. This makes the label/value spec just as verbose as needed.
(Right, I know a picture would help: A picture is worth a thousand words.)
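
Until there is a picture, a small sketch may help. It is in Python, not the Perl of the actual software, and it scans a flat list of text items instead of an X×Y matrix. All names (`TextItem`, `Region`, `find_value`, `label_top_max`) and the sample coordinates are hypothetical; this is only one way the lookup described above could work.

```python
from typing import NamedTuple


class TextItem(NamedTuple):
    left: int
    top: int
    string: str


class Region(NamedTuple):
    # Offsets relative to the label's (left, top); y grows downwards.
    dx_min: int
    dy_min: int
    dx_max: int
    dy_max: int


def find_value(items, label, rel, label_top_max=None):
    """Return the first string found inside `rel`, measured from `label`.

    `label_top_max` is the optional, terse absolute constraint: only
    labels located above that y coordinate are considered.
    """
    for lab in items:
        if lab.string != label:
            continue
        if label_top_max is not None and lab.top >= label_top_max:
            continue
        for item in items:
            if item is lab:
                continue
            dx = item.left - lab.left
            dy = item.top - lab.top
            if rel.dx_min <= dx <= rel.dx_max and rel.dy_min <= dy <= rel.dy_max:
                return item.string
    return None


# The label "Total" appears twice on this made-up page.
ITEMS = [
    TextItem(50, 700, "Total"), TextItem(400, 700, "250.00"),
    TextItem(50, 300, "Total"), TextItem(400, 300, "100.00"),
]
# Value expected 100..500 units to the right, on roughly the same line.
REL = Region(dx_min=100, dy_min=-5, dx_max=500, dy_max=5)

print(find_value(ITEMS, "Total", REL))
print(find_value(ITEMS, "Total", REL, label_top_max=500))
```

The first call takes whichever "Total" comes first in the list; the second restricts the label to the part of the page above y=500 without spelling out a full rectangle.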

My software is implemented in Perl, and so far the label/value specs are written programmatically. Of course, I would like to have the spec as XML or as a DSL, but I am not there yet.

To be continued …

Blogger Buzz: Add a virtual tip jar to your blog


Google Checkout as competition to Flattr.
I have not made reasonable amounts of money through Flattr yet, and I have no idea how it works for real writers. The question is whether Google Checkout will make it easier to earn money by sharing "text" on the web.

Here is the Google Checkout Blog.

The article mentioned above is in fact a cross-promotion between Google's Blogger department and Google's Checkout department.

Tuesday, May 17, 2011

how does Skype show you to yourself?

Skype on Mac OS X shows you as if you were looking into a mirror – I find that rather "natural" and "usual", just what I prefer.
Skype on Windows seems to show you "the other way round" – the way other people see you.

If you are having a video conversation via Skype, you will occasionally check whether your face is still roughly centered in the preview window. If you move or turn left and right and that is not shown as in a mirror, I find it weird.

XML text nodes in Ruby and Perl

The Ruby REXML Tutorial deals with this.

in Perl …

Monday, May 16, 2011

O'Reilly Media book: Perl Cookbook

Perl Cookbook, Second Edition - O'Reilly Media

Chapter 22 XML

  • 22.9: Reading and Writing RSS Files -> meta feeds
Useful scripts in the book's example code at http://examples.oreilly.com/9781565922433/:
  • oreilly--Perl_Cookbook--code/ch09/symirror – build spectral forest of symlinks

"meta feeds" bookmark collection

Ruby Cookbook - O'Reilly Media

11.16 A Simple Feed Aggregator -> meta feeds

book: Webbots, Spiders, and Screen Scrapers, by Michael Schrenk

Official Web Site: Webbots, Spiders, and Screen Scrapers, by Michael Schrenk

Chapter 12 features aggregation "webbots".