Facebook friends can see my valid mobile phone number; my home and my office phone numbers stay the same, calls get forwarded "reasonably".
I will definitely return to Berlin on December 23rd. Until then I will occasionally return to Berlin for the weekend.
Sunday, November 27, 2011
Saturday, November 26, 2011
Journal: end of the line for syslog? - The H Open Source: News and Features
Journal: end of the line for syslog? - The H Open Source: News and Features: Lennart Poettering and Kay Sievers have developed a new logging system for Linux. The Journal daemon, which integrates with the service management daemon systemd, is intended as a replacement for syslog
New user interface for Firefox on Android - The H Open Source: News and Features
New user interface for Firefox on Android - The H Open Source: News and Features: Tonight's Nightly build of Firefox for Android will see the mobile browser gain a native user interface, replacing XUL. The change should make the browser load faster and reduce the amount of memory it requires
Netfilter developers working on NAT for ip6tables - The H Open Source: News and Features
Netfilter developers working on NAT for ip6tables - The H Open Source: News and Features: Netfilter developer Patrick McHardy has released patches for the ip6tables IPv6 packet filter under Linux; the patches allow the software to replace the address information in IPv6 data packets
Thursday, November 24, 2011
rsync, ionice, iotop
c't-Archiv, 23/2011, page 154: "Rsync zwingt Linux-Server in die Knie" ("Rsync brings Linux servers to their knees").
The above article (in German) is about using rsync together with ionice (like "nice" but for I/O load).
You can watch your processes' I/O load with iotop (like "top" but for I/O load).
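The combination can be sketched as a small wrapper. This is only a minimal sketch, assuming a Linux box with `ionice` (util-linux) and `rsync` installed; the function names and the paths are made up for illustration, and how much the idle class actually helps depends on the kernel's I/O scheduler.

```python
import shlex
import subprocess


def build_low_priority_rsync(source, destination):
    """Return an argv list that runs rsync in the idle I/O scheduling class."""
    # ionice -c 3 selects the "idle" class: rsync should only get disk time
    # when no other process is asking for it.
    return ["ionice", "-c", "3", "rsync", "-a", source, destination]


def run_low_priority_rsync(source, destination):
    """Run the command and return rsync's exit status (hypothetical helper)."""
    return subprocess.run(build_low_priority_rsync(source, destination)).returncode


if __name__ == "__main__":
    # Hypothetical paths; the command is only printed here, not executed.
    print(shlex.join(build_low_priority_rsync("/srv/data/", "/mnt/backup/data/")))
```

While such a job runs, iotop in a second terminal shows whether rsync still dominates the disk.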
Labels:
rsync
Tuesday, November 22, 2011
Adium offers 1.5rcX updates, but they keep crashing on startup and restarting
So I keep re-installing 1.4.4, which causes me no headache.
Update 2011-12-24:
Still the same problem (incompatibility with the Skype plugin) up until 1.5b8.
Update 2012-02-01:
Still the same problem (incompatibility with the Skype plugin) up until 1.5rc2.
Labels:
Adium,
IM clients,
instant messaging,
Skype
Friday, November 18, 2011
Thursday, November 17, 2011
page scraping / web harvesting / webscraping: are there really no serious and nice jobs out there? just these crappy ones?!?
I actually prefer to employ my competence for serious jobs.
I implement page scrapers using
- libcurl
- perl
- LiveHTTPHeaders and ieHTTPHeaders – I can almost automatically replay "a walk through the web" using their log files
More details on my competence: JHwis.
Labels:
page scraping,
web harvesting,
web scraping
Wednesday, November 16, 2011
Google details location services opt-out for Wi-Fi access point owners - The H Security: News and Features
Appending "_nomap" to one's SSID – what a sick idea!
Tuesday, November 15, 2011
XML::RSS - creates and updates RSS files - metacpan.org
Is there anything similar for Atom, the other XML-based Web syndication format?
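For comparison, a minimal Atom document is small enough to build by hand. The following is only a sketch using Python's standard library, not a counterpart to XML::RSS; the function name, the feed data and the `example.org` URLs are invented for illustration. It emits the elements Atom requires for a feed and its entries (id, title, updated, plus an author).

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def build_atom_feed(feed_id, title, updated, author, entries):
    """Serialize a minimal Atom feed; entries are dicts (hypothetical shape)."""
    ET.register_namespace("", ATOM_NS)  # emit Atom as the default namespace
    feed = ET.Element(f"{{{ATOM_NS}}}feed")

    def add(parent, tag, text=None, **attrs):
        child = ET.SubElement(parent, f"{{{ATOM_NS}}}{tag}", attrs)
        child.text = text
        return child

    add(feed, "id", feed_id)
    add(feed, "title", title)
    add(feed, "updated", updated)
    add(add(feed, "author"), "name", author)
    for entry in entries:
        node = add(feed, "entry")
        add(node, "id", entry["id"])
        add(node, "title", entry["title"])
        add(node, "updated", entry["updated"])
        add(node, "link", href=entry["link"])
    return ET.tostring(feed, encoding="unicode")


if __name__ == "__main__":
    print(build_atom_feed(
        "http://example.org/feed", "Example feed", "2011-11-15T12:00:00Z", "Example Author",
        [{"id": "http://example.org/1", "title": "First post",
          "updated": "2011-11-15T12:00:00Z", "link": "http://example.org/1"}],
    ))
```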
Why Google should buy Barnes & Noble — Tech News and Analysis
And that's not because of the (Android) B&N vs Microsoft case …
Monday, November 14, 2011
rather satisfied with today's page scraping work
I did not experience much trouble, everything works just as expected. There could be more days like this one.
Labels:
JHwis,
page scraping
another page scraping task for the same client
It's getting to be more fun again, now that I have refamiliarized myself with my "old" tool set.
- First I take care of the forward navigation.
- Got the loop operating.
- But will the loop also stop?
- Yes, the loop stops successfully.
- Now for the content.
- No, reworking the loop first.
- Alright, the navigational part works fine.
- Now for the content.
- Content matched and split.
- CSV output is fine.
- TBD: RSS and Atom output.
- …
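The steps above (forward navigation loop, a stop condition, matching and splitting the content, CSV output) can be sketched roughly like this. It is not the actual tool set, just a self-contained illustration: the page contents, the `rel="next"` link convention and all function names are assumptions, and `fetch` returns canned HTML instead of downloading anything.

```python
import csv
import io
import re

# Hypothetical site: each page lists items and may link to a next page.
PAGES = {
    "/list?page=1": '<li>Alpha | 10</li><li>Beta | 20</li>'
                    '<a rel="next" href="/list?page=2">next</a>',
    "/list?page=2": '<li>Gamma | 30</li>',
}

ITEM_RE = re.compile(r"<li>(.*?)</li>")
NEXT_RE = re.compile(r'<a rel="next" href="([^"]+)"')


def fetch(url):
    """Stand-in for a real HTTP download; returns canned HTML."""
    return PAGES[url]


def scrape(start_url, max_pages=100):
    """Follow 'next' links from start_url and return the split item rows."""
    rows, url, seen = [], start_url, set()
    # The loop stops when there is no next link, a page repeats,
    # or the page limit is reached.
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        html = fetch(url)
        for item in ITEM_RE.findall(html):  # match the content ...
            rows.append([field.strip() for field in item.split("|")])  # ... and split it
        match = NEXT_RE.search(html)
        url = match.group(1) if match else None
    return rows


def to_csv(rows):
    """Render the rows as CSV text."""
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    print(to_csv(scrape("/list?page=1")), end="")
```

Keeping the navigation loop separate from the content matching mirrors the order above: the loop can be tested for termination before any content pattern exists.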
Labels:
JHwis,
page scraping
movie: Contagion (2011)
Contagion (2011)
sort of documentary drama. not easy entertainment.
I had dental treatment today, and it made me feel that things can always be improved, in medicine too.
Sunday, November 13, 2011
Saturday, November 12, 2011
procmail rules: changing somebody's status from A to junk
And blacklisted on "all" my phones as well.
This time it should last for a while.
Labels:
procmail
Friday, November 11, 2011
Thursday, November 10, 2011
Nicole Fritsche under fire | More and more Left Party (Linkspartei) members are distancing themselves from the politician who called for spying on Pirates
Nicole Fritsche under fire | Telepolis: "More and more Left Party (Linkspartei) members are distancing themselves from the politician who called for spying on Pirates"
Wednesday, November 9, 2011
Tuesday, November 8, 2011
Sunday, November 6, 2011
The Internet of Things comes to Eclipse - The H Open Source: News and Features
The Internet of Things comes to Eclipse - The H Open Source: News and Features: Together with IBM, Sierra Wireless and Eurotech, the Eclipse Foundation wants to improve communication between machines and is basing their plans around the MQTT protocol
Friday, November 4, 2011
I need nice samples of XPath and CSS expressions for HTML
Something like /html/body/p…
- O'Reilly's XPath and XPointer does not have a lot of HTML examples
- search the web for "xpath html" – there are a lot of hits
- …
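A few samples of the kind meant here, as a hedged sketch: the XHTML snippet is invented, and Python's `xml.etree.ElementTree` only supports a subset of XPath with paths relative to the root element, so `./body/p` stands in for `/html/body/p`. ElementTree has no CSS selector engine, so the CSS counterparts appear only as strings for comparison.

```python
import xml.etree.ElementTree as ET

# Invented, well-formed XHTML sample (ElementTree cannot parse tag soup).
DOC = """<html>
  <body>
    <div class="nav"><a href="/home">Home</a><a href="/about">About</a></div>
    <p>First paragraph</p>
    <p>Second <a href="/more">more</a></p>
    <ul><li>one</li><li>two</li></ul>
  </body>
</html>"""

root = ET.fromstring(DOC)  # root is the <html> element

# (XPath as ElementTree accepts it, roughly equivalent CSS selector)
SAMPLES = [
    ("./body/p", "html > body > p"),           # i.e. /html/body/p
    (".//a[@href]", "a[href]"),                # every link with an href
    (".//div[@class='nav']/a", "div.nav > a"),  # XPath compares the whole class value
    (".//ul/li[2]", "ul > li:nth-of-type(2)"),  # second <li> of each list
]


def texts(xpath):
    """Return the leading text of every element matched by the XPath."""
    return [element.text for element in root.findall(xpath)]


if __name__ == "__main__":
    for xpath, css in SAMPLES:
        print(f"{xpath:28} {css:26} {texts(xpath)}")
```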
my MacBook made a funny noise, sounded a little like "dying hard disk"
how bad! I hate that kind of noise. disks aren't really expensive, but will I be able to copy all the contents of the old disk to the new one, before it's really dead? whatever variant, they all cost far too much time.
Labels:
MacBook
"… is now following you on Twitter" – why do I hate these notifications?
I hate them, because 99% of these new followers simply expect you to follow them back, and their names just don't sound intriguing, so I will never follow them back.
Labels:
Twitter
my new page scraping assignment – getting familiar again with my toolkit
For my new page scraping assignment I thought for a while of trying a much more modern approach.
That actually kept me from really starting it for quite a few weeks now, because it seemed so very tedious and I thought I don't have, like, 3 shots at it.
This week I thought about going with my own old approach and about making use of the state-of-the-art technology at a (slightly) later stage. That should work.
So where is my software and where is my documentation?
- I remember, I had left a link here at my Aleph-Soft.com website
- that leads me to my slightly more extensive dedicated article
- of course, while I read it, I switch to the sources of that article, so that I can improve the article "en passant"; OMG: running that DocBook website toolchain even works after at least a year or so! I'm amazed. well, not updating software does have some positive side-effects.
- does LiveHTTPHeaders still work with my current Firefox? LiveHTTPHeaders is one of the reasons I still keep my Firefox updated, although I chose Chromium as my main browser on all platforms (*** bookmark ***)
- what about its cousin ieHTTPHeaders for IE? WTF, where does it actually live and get maintained? alright, I assume Jonas Blunck is the creator and maintainer
- is there anything like *HTTPHeaders for Chrome/Chromium? that would be nice; I would have to make my respective tool read its logfile then
- creating a perl script from LiveHTTPHeaders's log file still works
- integrated that perl script into my framework for that kind of stuff
- download the root HTML page, parse it, extract the first few bits of the wanted information
- download the 1st linked page; the navigation doesn't go further / deeper than this
- TBD: extract the information details from that linked page; CAVEAT: there is an optional intermediate ("region") level within that page
- …
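The idea of turning a recorded header log into a replayable walk can be sketched as follows. This is not the perl tool mentioned above. The log layout used here (URL line, blank line, request line and headers, blank line, response headers, then a dashed separator) is an assumption about what such a log looks like, and the sample log, URLs and function names are invented. Nothing is sent over the network; the requests are only constructed.

```python
import re
import urllib.request

# Assumed log layout; the contents are invented for illustration.
SAMPLE_LOG = """\
http://www.example.com/

GET / HTTP/1.1
Host: www.example.com

HTTP/1.1 200 OK
Content-Type: text/html
----------------------------------------------------------
http://www.example.com/region/north

GET /region/north HTTP/1.1
Host: www.example.com
Referer: http://www.example.com/

HTTP/1.1 200 OK
Content-Type: text/html
----------------------------------------------------------
"""

REQUEST_LINE = re.compile(r"^(GET|POST|HEAD) \S+ HTTP/\d\.\d$")


def parse_log(text):
    """Split the log at the dashed separators and collect one step per block."""
    steps = []
    for block in re.split(r"^-{10,}[ \t]*$", text, flags=re.MULTILINE):
        lines = block.strip().splitlines()
        if not lines or not lines[0].startswith(("http://", "https://")):
            continue
        step = {"url": lines[0].strip(), "method": "GET", "headers": {}}
        in_request = False
        for line in lines[1:]:
            match = REQUEST_LINE.match(line.strip())
            if match:
                step["method"] = match.group(1)
                in_request = True
            elif in_request and not line.strip():
                break  # a blank line ends the request headers
            elif in_request and ":" in line:
                name, value = line.split(":", 1)
                step["headers"][name.strip()] = value.strip()
        steps.append(step)
    return steps


def build_request(step):
    """Turn one parsed step into a urllib Request object (not sent here)."""
    return urllib.request.Request(step["url"], headers=step["headers"], method=step["method"])


if __name__ == "__main__":
    for step in parse_log(SAMPLE_LOG):
        request = build_request(step)
        print(request.get_method(), request.full_url)
```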
(This article is getting extended and updated these days in early November 2011.)
Labels:
JHwis,
page scraping
at its best, "philo-Semitism" led a narrow part of British society to favor the resettlement of Jews in their homeland
British Philo-Semitism, Once and Future » Main Feature » Jewish Ideas Daily
I was tempted to mention that the link refers to a rather interesting article, but for what other reason would it have found its way here?
Labels:
Israel,
Judaism,
philosemitism
Thursday, November 3, 2011
Wednesday, November 2, 2011
Mercurial 2.0 adds large binary file support - The H Open Source: News and Features
Mercurial 2.0 adds large binary file support - The H Open Source: News and Features: Simpler backporting of individual changes and the ability to handle large binary files efficiently are promised by the latest edition of the distributed revision control tool
Hortonworks launches Apache Hadoop based platform - The H Open Source: News and Features
Hortonworks launches Apache Hadoop based platform - The H Open Source: News and Features: Yahoo spinoff Hortonworks has announced Hortonworks Data Platform, a 100% Apache-licensed Hadoop-based system for "big data" and other large scale distributed computing
apparently SPARQL is not going to help me with my web-harvesting / page-scraping task
What a pity! Would have been far too elegant. But if everybody makes use of RDF, stealing is just too easy.
Labels:
page scraping,
RDF,
SPARQL,
web harvesting
Tuesday, November 1, 2011
Dennis Ritchie's legacy of elegantly useful tools - O'Reilly Radar
Dennis Ritchie's legacy of elegantly useful tools - O'Reilly Radar: "We need more people who share Dennis Ritchie's spirit."
Labels:
UNIX,
unix_utilities
Learning SPARQL - O'Reilly Media / came up another time
Learning SPARQL - O'Reilly Media: "Querying and Updating with SPARQL 1.1"
An update alert came in from O'Reilly.
The hope grows that SPARQL may be the elegant solution to my current web harvesting / page scraping task.
Update 2011-11-02
- Started reading the book
- installing the software and the samples
- running a few samples
- looks as if SPARQL cannot query HTML or XML that has not been specifically formatted or marked up for it (frustration here on my side)
Labels:
OReilly,
page scraping,
SPARQL,
web harvesting,
web scraping,
XML