Dumping Your Safari History With Ruby (Apple’s Curious Epoch)

January 7th, 2009  |  Published in ruby  |  2 Comments

This is sort of weird, but it turned out o.k. in the end:

Safari stores its history in ~/Library/Safari/History.plist. It keeps the lastVisitedDate of each entry in Apple’s CFAbsoluteTime format, which is a float, and which represents, according to a message on carbon-dev, a count of seconds since midnight GMT on January 1, 2001.

I don’t know what the significance is of that date, and I don’t know near enough about the traditional Unix epoch to know why it might or might not serve (insert tentative mumbling about the y2k38 problem from the guy who barely made it out of Algebra I in high school … Ed? You always have something to say about this stuff) but at least it was easy to deal with once I knew what CFAbsoluteTime started counting at:


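#!/usr/bin/ruby
require "rubygems"
require "osx/plist"

# CFAbsoluteTime counts seconds from this reference date.
MacEpoch = Time.utc(2001, "jan", 1, 0, 0, 0)

plist = File.expand_path("~/Library/Safari/History.plist")

# A truthy second argument makes load return [plist, format],
# which is why the parsed data is read from history[0] below.
history = OSX::PropertyList.load(File.read(plist), true)

history[0]["WebHistoryDates"].each do |h|
  date = MacEpoch + h["lastVisitedDate"].to_f
  puts "#{h["title"]} -- #{date}"
end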
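
The date arithmetic can also be checked without the plist gem. This is a minimal sketch using only core Ruby; the constant and method names (MAC_EPOCH_OFFSET, cf_absolute_to_time, time_to_cf_absolute) are made up for illustration and are not part of osx/plist or any Apple API.

```ruby
# Seconds between the Unix epoch (1970-01-01 UTC) and the
# CFAbsoluteTime reference date (2001-01-01 UTC).
MAC_EPOCH_OFFSET = Time.utc(2001, 1, 1).to_i # 978_307_200

# Convert a CFAbsoluteTime value (String or Numeric) to a UTC Time.
def cf_absolute_to_time(cf_seconds)
  Time.at(cf_seconds.to_f + MAC_EPOCH_OFFSET).utc
end

# Convert a Time back to CFAbsoluteTime seconds.
def time_to_cf_absolute(time)
  time.to_f - MAC_EPOCH_OFFSET
end

puts cf_absolute_to_time("0.0")   # 2001-01-01 00:00:00 UTC
puts cf_absolute_to_time(86_400)  # one day after the reference date
```

Adding the offset to get a Unix timestamp is equivalent to adding the float to a Time anchored at 2001, which is what the script above does.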



Another weird thing: Using titles is o.k., but I’d rather parse URLs so I can track the attention a domain is getting. When osx/plist coughs up the hash with all the history stuff in it, the first key, the one with the URL, is an empty string. I can get at it with h[""], but it seems kind of off, and it seems to render osx/plist unable to dump the history back out to XML for less clunky parsing. A quick peek at the plist with Property List Editor, though, and the key is, indeed, empty.
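
Counting visits per domain could look something like the sketch below. It assumes each history entry is a hash with the URL stored under the empty-string key, as described above; the domain_counts method and the sample entries are invented for illustration and stand in for the real history[0]["WebHistoryDates"] array.

```ruby
require "uri"

# Tally visits per host. Each entry is assumed to carry its URL
# under the "" key, like the hashes osx/plist hands back.
def domain_counts(entries)
  counts = Hash.new(0)
  entries.each do |entry|
    begin
      host = URI.parse(entry[""].to_s).host
    rescue URI::InvalidURIError
      host = nil # skip URLs that URI cannot parse
    end
    counts[host] += 1 if host
  end
  counts
end

# Made-up sample entries, not real history data.
sample = [
  { "" => "http://example.com/a", "title" => "A" },
  { "" => "http://example.com/b", "title" => "B" },
  { "" => "http://example.org/",  "title" => "C" },
  { "" => "not a url",            "title" => "D" }
]

# Print the busiest hosts first.
domain_counts(sample).sort_by { |_, n| -n }.each do |host, n|
  puts "#{n}\t#{host}"
end
```

Entries whose URL is missing or unparseable are simply dropped from the tally here; whether that is the right call depends on how messy the real history turns out to be.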

So all that’s the nub of a further exploration of what’s getting my attentional resources.

Attentional Followup

Yesterday I dropped a bunch of stuff from NetNewsWire, and it was a pleasure to wake up to an immensely less cluttered morning scan. One thing all my poking around in the attentional numbers revealed was that I tend to read a lot of the stuff I’d prefer to read the least. Dumping a few dozen feeds brought the highest attentional score for any single feed down to 81, from 150 for the top feed before the culling. Most of the remaining feeds score well below 50.

It seemed a little counterintuitive to get rid of the things I read the most — they’re surely what I’m most interested in, right? — but after a little reflection I realized they’re the things I’m most eager to distract myself with, and there’s a difference between those categories.

By way of further exploration, I think I’m going to tweak my script a little so it can take advantage of another NetNewsWire library feature, which is the “scripted attention score”:

To affect the calculated attention score, you can change the scripted attention score. The scripted attention score is a component of the calculated attention score (it’s added).

I can organize feeds into categorized folders, add a scripted attention score value to each of the folders based on my “real” interest in a given category, and start using that to refine cull recommendations from time to time.
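
As a rough sketch of that arithmetic only (it does not talk to NetNewsWire at all): the folder names, the per-folder values, the Feed struct, and the adjusted_score method below are all hypothetical, and the only thing taken from the quoted documentation is that the scripted score gets added in.

```ruby
# Hypothetical per-folder scripted scores reflecting "real" interest.
FOLDER_SCRIPTED_SCORE = {
  "work"        => 20,
  "friends"     => 10,
  "distraction" => -30
}

# Hypothetical feed record: name, folder, and the score that
# comes from reading behavior alone.
Feed = Struct.new(:name, :folder, :behavior_score)

# The scripted score is added to the behavior-based score.
# Folders without an entry contribute nothing.
def adjusted_score(feed, folder_scores = FOLDER_SCRIPTED_SCORE)
  feed.behavior_score + folder_scores.fetch(feed.folder, 0)
end

# Made-up feeds for illustration.
feeds = [
  Feed.new("Gadget Gossip", "distraction", 81),
  Feed.new("Ruby News",     "work",        40),
  Feed.new("Old Pals",      "friends",     25)
]

# List feeds from lowest to highest adjusted score.
feeds.sort_by { |f| adjusted_score(f) }.each do |f|
  puts "#{adjusted_score(f)}\t#{f.name} (#{f.folder})"
end
```

How the adjusted number should feed into an actual cull decision is left open here; the sketch only shows the folder weighting being applied.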

Responses

  1. Graphing NetNewsWire With the Google Chart API :: dot unplanned says:

    January 9th, 2009 at 10:19 am (#)

    […] And here it is doing the Safari views-per-day history: […]

  2. We Did It That Way for a Reason :: dot unplanned says:

    June 3rd, 2010 at 12:39 am (#)

    […] my NetNewsWire use with Ruby. And there’s also an entry about how easy it is to dump all of Safari’s history with Ruby […]


© Michael Hall, licensed under a Creative Commons Attribution-ShareAlike 3.0 United States license.