Archive for 2010

Split Reading

Regarding my July tech purge, about which I wrote:

“So today I’m going to experiment with eliminating all of it from my RSS reader and Google News page except the professionally mandated stuff (which is a pretty narrow field) and maybe a few security update sites. Consumer tech, though? Gone. If it’s not something I use, I don’t want to read about it. If it is something I use, I don’t need a columnist telling me what to think about it. If I need to know something, I’ll just go looking. Passively setting up trawls and then sifting through whatever gets caught up in a few keyword searches or catches the fancy of tech bloggers is for the birds.”

I kept to that for a while — right up until around the end of November — then started discovering some consumer tech stuff was creeping back in. Less gadget/hardware (though there was a little of that) and more Web/online.

The default place to put that kind of thing is in my RSS reader, so that’s where that stuff went as it found its way back in. But the list of things I want to be distracted by during the day is still pretty narrow, so my RSS reader wasn’t a good place for them.

So I ended up taking a lot of those things right back out, then hunting them down in a form I could consume via Flipboard, which is a really lovely app for the iPad that takes Twitter (and more recently RSS) feeds and wraps them in a magazine-like format you can flip through. (Follow the link and look at the video for an idea of how it works.)

For the professional/don’t mind seeing come by during the day stuff, I’ve got a desktop RSS reader with companion apps for the iPhone and iPad. For the stuff I don’t want to catch my eye as easily during the work day, I’ve got Flipboard. When I call it a day in my office, the work stuff pretty much stays in the office, but the things that are more personal are there on the iPad, which is the only computing device I touch after 6 p.m.

One of the nice things about Flipboard is that it will use Twitter lists as well as vanilla Twitter feeds, and that’s caused me to start using Twitter more and more the way everyone else has been using it for a while now: Rather than following hundreds and hundreds of people and institutions, I keep my follow list sort of slim, but have added bunches and bunches of sources to assorted Twitter lists that I subscribe to with Flipboard. I treat Flipboard as a mostly optional reading experience. Something more to be browsed than closely read. It doesn’t nag about how many unread items I have, which is fine because nothing in there (with one set of exceptions) is anything I really need to read. Since it’s easy to flip past stuff that’s not interesting, I feel a little better disposed to outlets that were annoying the hell out of me when they were mixed in with the stuff I really need to think about during the day.

I could use the many canned lists that are (ugh) curated by assorted Web luminaries, but I’ve found those lists are much better as starting points to be raided for good sources and stripped of assorted a-lister cronies and other annoyances. Cruising Twitter profiles for “more like this” lists is pretty fruitful, too.

I also use Twitter lists for the tweetsonae of friends who’ve got commercial or promotional feeds that run parallel to their personal Twitter feeds. It’s a good way to keep out the double (and triple) posts, continue to be open to retweeting or absorbing the promotional stuff (as a good friend should be), and cut down on the workday distractions.

iOS/Mac Cross Pollination

I’ve spent my share of time wishing that favorite Mac apps would make some sort of appearance as iOS apps, but I hadn’t really considered going the other direction: Developers porting successful iOS apps to MacOS. With the upcoming Mac App Store, it makes sense that developers who’re familiar with the Apple app process might think about ways to do that.

Reeder for Mac

I use Reeder on my iPhone and iPad every day. It’s an outstanding RSS reader, both in terms of simple functionality and raw visual appeal. On my Mac, however, I’ve been using NetNewsWire for years.

When Reeder’s developer announced a Mac version of his iOS app, I was feeling more politely skeptical than anything: NetNewsWire is loaded with features and I’ve been writing AppleScript (or rb-appscript) for it for years. So I downloaded Reeder for Mac’s first beta release and fired it up.

It doesn’t offer much of anything its iOS versions don’t, and even does something I typically don’t like, which is provide textured backgrounds and a non-standard toolbar instead of simpler and more “Mac-like” interface elements. Like its iOS cousin (and NNW), it syncs with Google Reader, which is a great feature for any RSS app.

I expected I’d probably switch back to NNW after an hour or two, but that hasn’t been the case: It’s true that Reeder doesn’t offer a number of the features NNW does, but it’s also true that I read a lot more feeds on my iPhone or iPad than I do on my Mac, and Reeder’s sharing menu (which supports services I use, like Instapaper and Pinboard) is perfectly adequate. It causes me to look at some of the scripts I’ve written and the functionality I thought represented some baseline standard for an RSS reader and wonder if I haven’t been very, very silly.

So in this case the simplicity of iOS has shaped my expectations for a desktop app. I’m sure there are classes of apps where that probably won’t ever happen — mail clients and text editors spring to mind — but it makes me wonder what else might work better when inspired by the more stripped down aesthetic of iOS apps than the “there’s a menu item for that” approach of desktop apps, and how many features I thought I needed that will seem sort of silly when I can compare apps coming from the two platforms side by side.

A Brief History of Mail Services

1999: I need a mail server that syncs up between my home and office computers so I don’t have to continually re-read mail I saw at one or the other. I set up the UWash IMAP server on my home server and I’m good.

2000: Still using IMAP, but I change email clients a few times along the way as more and more Linux clients add support for IMAP. That doesn’t matter, because even if IMAP behaves a little differently from server to server, and even if some of the client implementations aren’t awesome, they all work together.

2004: What? Still IMAP. I’ve switched to Macs by now, but I’ve been able to use Entourage, Mail.app, mutt, probably a few things I’ve forgotten. It’s not an issue because IMAP just works with all of them.

2009: The new employer has an Exchange server. My choices are: Entourage, Outlook or the Outlook Web edition. Nobody turned on IMAP (at least not for someone with my pull), so I’m out of luck with any mail client I’ve used prior. I’d like to use Snow Leopard’s Mail app, which touted support for Exchange servers, but it only works with Exchange Server 2007. If you want to use Entourage with ES, you need the Pro edition of MS Office. Already own Home edition? You can’t upgrade for the difference between the two packages: You have to buy the whole thing again. Well played, Microsoft. Well played, indeed. (I do not, by the way, submit to this particular bit of extortion.)

2010: Say … still using IMAP for everything but work mail, and I notice that Microsoft is about to release Office for Mac 2011. This time, instead of crummy Entourage it has Outlook. Only $199? Well … if it’ll let me hook up to that Exchange Server 2003 install at work, I’ll consider it. What’s that, Microsoft? Outlook 2011 will only support Exchange Server 2007? Pity.

Fortunately for me, there’s Davmail. What’s Davmail? Just a desktop proxy that can talk to Exchange Server 2003 and in turn allow you to use any old mail client with it because, as it turns out, Davmail’s developers have had the same experience with IMAP I did: It just works with just about everything.

Thanks for nothing, Microsoft.

And thanks for nothing, Apple, for that matter: My iPhone running iOS 4 is able to talk to Exchange Server 2003 to the point I can even acknowledge, cancel or modify calendar events. Snow Leopard? Exchange Server 2007 or nothing.

Ballpark Digest Relaunched

The new Ballpark Digest went live today.

This job was pretty similar to the Arena Digest relaunch, and involved a lot of the same tasks.

There were over 2,500 legacy articles that needed to be imported this time around, and preserving their search engine placement was a little more important because the site was pretty well indexed. I had to pick up one new trick, too:

Not having any legacy i.d. numbers to work with during the import, I ended up having to figure out the legacy URLs on my own. I knew the articles were, at least, in the proper order, so at the beginning of the import data, the second article in the list had an i.d. of “2”, the tenth in the list had an i.d. of “10”, etc. Unfortunately, that 1-1 mapping broke down the first time an article was published then taken down, because my import data didn’t note missing articles. By the time I got to the 2,000th article, the relationship between import row and legacy i.d. was off by a pretty substantial amount.

I grabbed the RBing gem and automated the process of searching by article title and using the URL I got back to figure out the article’s old i.d. and URL. That didn’t work perfectly, because there were some gaps in Bing’s indexation of the site. So I had to write a second script that ran down the list of articles and looked at each i.d., applying the following algorithm:

  • If the article i.d. was one greater than the i.d. of the article before it, and one less than the i.d. of the article after it, I assumed it was o.k.

  • If the article i.d. didn’t match the above criteria, but the i.d. of the article after it was two greater than the i.d. of the article before it, I assumed the real page for that article had failed to be indexed, and I assigned it the i.d. between those of the articles on either side of it in the sequence.

  • If the article i.d. didn’t meet either of those criteria, I flagged it for review.
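The three rules above can be sketched in Ruby roughly as follows. This is a minimal illustration, not the script that was actually used: the method name `check_ids`, the status symbols, and the idea of holding the looked-up i.d.s in a plain array are all assumptions.

```ruby
# Classify each looked-up i.d. by comparing it with its neighbors.
# `ids` is a hypothetical array of integers (nil where the search
# returned nothing usable), in original publication order.
def check_ids(ids)
  ids.each_with_index.map do |id, i|
    prev_id = i.zero? ? nil : ids[i - 1]
    next_id = ids[i + 1]

    if prev_id.nil? || next_id.nil?
      # Rows without two known neighbors can't be checked; flag them.
      [id, :review]
    elsif id == prev_id + 1 && id == next_id - 1
      # Rule 1: fits exactly between its neighbors.
      [id, :ok]
    elsif next_id == prev_id + 2
      # Rule 2: exactly one slot between the neighbors, so fill it.
      [prev_id + 1, :inferred]
    else
      # Rule 3: nothing lines up, so a human should look.
      [id, :review]
    end
  end
end

result = check_ids([1, 2, 3, 40, 5, 6, 7])
result.each { |id, status| puts "#{id.inspect}\t#{status}" }
```

Read literally, the rules also flag the rows on either side of a bad lookup, because their own neighbor checks fail too. That errs on the side of more manual review rather than less.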

Most of the time, the ones that were flagged for review were part of a streak of articles that hadn’t been indexed properly to begin with, so the best result Bing could produce was an easily recognizable archive page URL. It was easy to consult the list and see sequences like this:

  • 453

  • 454

  • archive URL

  • archive URL

  • archive URL

  • archive URL

  • 459

Clearly the third through sixth articles in the list had to be 455, 456, 457 and 458. I felt a little guilty for not taking the time to work out a way to do that programmatically, but there were only three or four sequences like that so I sucked it up. There were also a few sequences where there was no discerning the proper sequence, but that list totaled fewer than 15.
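For what it's worth, filling a streak like that could be automated along these lines. This is only a sketch under assumptions: unknown i.d.s are represented as `nil`, the method name `fill_gaps` is made up, and a run is filled only when the known i.d.s on either side leave exactly enough room for it.

```ruby
# Fill runs of unknown i.d.s (nil) when the known i.d.s on either
# side of the run leave exactly as many free numbers as there are rows.
def fill_gaps(ids)
  filled = ids.dup
  i = 0
  while i < filled.length
    if filled[i].nil?
      run_start = i
      # Advance to the first known i.d. after the run (or the end).
      i += 1 while i < filled.length && filled[i].nil?
      before = run_start.zero? ? nil : filled[run_start - 1]
      after = filled[i] # nil when the run reaches the end of the list
      run_length = i - run_start
      if before && after && after - before == run_length + 1
        run_length.times { |k| filled[run_start + k] = before + k + 1 }
      end
    else
      i += 1
    end
  end
  filled
end

p fill_gaps([453, 454, nil, nil, nil, nil, 459])
```

Runs that don't fit cleanly between their neighbors are left as `nil`, which matches the handful of sequences that still needed a person to sort out.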

Once all the i.d.’s were straightened out, I wrote a script to generate the redirects, and plopped them into the site .htaccess.
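A redirect generator of that sort might look something like the sketch below. The legacy URL pattern and the new paths are invented for illustration, since the post doesn't say what the real ones looked like; the only thing taken as given is Apache's `Redirect` directive from mod_alias.

```ruby
# Hypothetical shape of a legacy article URL; the real pattern is not
# given in the post.
LEGACY_PATTERN = "/articles/%d.html"

# Turn a { legacy_id => new_path } hash into mod_alias redirect lines,
# sorted by legacy i.d. so the output is easy to scan.
def redirect_lines(mapping, pattern = LEGACY_PATTERN)
  mapping.sort.map do |legacy_id, new_path|
    "Redirect 301 #{format(pattern, legacy_id)} #{new_path}"
  end
end

# Example data, also made up.
mapping = {
  454 => "/news/another-article",
  453 => "/news/some-article"
}

puts redirect_lines(mapping)
```

The output can be pasted into (or appended to) the site's .htaccess. Using a 301 tells search engines the move is permanent, which is the point when the goal is to preserve existing placement.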

Oregon State Fair

We spent part of the day at the Oregon State Fair in Salem.

Best part: It was pretty quiet on the cable car that ran over the fairgrounds. You could look down into the grease disposal areas kept behind the food stands as well as gain an appreciation for the rich diversity of the Oregon airbrushed t-shirt industry. With a good lens, you could also see what people were eating.

Worst part: The live alligator exhibit was one of the most depressing things I’ve ever seen in my life. There was an alligator, it was alive, and it was kept in a pool of water where it had just enough room to be alive in a pool of water. Horrible. Whoever allowed that should be kept in a bathtub full of dirty water for a week while people walk by and gawk. Then fired.

You can spend a lot on rides. We did.

Two Fixes

Boy did I get sick of having to set this every time I ran Panopticon:

    look_back = 3.days

Some days it was “2.days,” some days it was “1.day,” some days it was “3.days.” So:

    tsf  = "/Users/mike/.panopticon_stamp"

    timestamp = File.stat(tsf).mtime

    look_back = Time.now - timestamp

then when all is done:

    `touch #{tsf}`
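The same idea can be written without shelling out, and with a fallback for the first run when the stamp file doesn't exist yet. This is a sketch rather than Panopticon's actual code: the method names and the three-day fallback are assumptions.

```ruby
require "fileutils"

# Seconds since the stamp file was last touched, or a fallback value
# (three days by default) if the file does not exist yet.
def seconds_since_last_run(stamp_path, fallback = 3 * 24 * 60 * 60)
  return fallback unless File.exist?(stamp_path)

  Time.now - File.stat(stamp_path).mtime
end

# Record that a run just finished by updating the file's mtime,
# creating the file if necessary.
def mark_run(stamp_path)
  FileUtils.touch(stamp_path)
end
```

Call `seconds_since_last_run` at the start of a run to get the look-back window, then `mark_run` once everything has finished, so a run that dies halfway through doesn't move the stamp.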

It wouldn’t have even been a problem if I’d just checked some items as “done,” but when I accidentally create an item, or decide an item doesn’t need to be read or processed or whatever, I just want it gone and I don’t want it sitting in a log claiming to be a thing I did or read. Because I didn’t. But when you delete an item, it’s gone and Panopticon can’t find it anywhere and will recreate it, which meant I was being haunted by zombie tasks that I most pointedly did not want to deal with anymore. Now I won’t be.

But all that put the next thing in mind: It’d be pretty nice to have things I mark as “done” in Panopticon get unmarked/unflagged/unstarred or whatever in their native app. So a starred item in Google Reader stops being starred, a flagged message gets unflagged, a bookmark in delicious gets tagged as “read,” etc.

Fixing Sampled Reporting

Probably no time for that this week. I spent a lot of time fiddling around with some problems introduced by sampling in Google Analytics.

The brief version: If you make queries against the Analytics API that involve more than 500k events, you start getting sampled data. The article specifically mentions pulling reports for long periods of time. I’ve been pulling reports for a number of sites that do well north of 500k visits per month, so when I started pulling queries for periods 60 or 90 days long I was most certainly getting back sampled data.

When I started trying to do really simple reporting about how page views changed month-over-month, I started seeing articles that somehow had fewer total page views when they were 60 days old than when they were 30 days old. Changing my approach to gathering page views from “single long period” to “consecutive shorter periods” cleared the problem up. Rather than pulling queries for a period like “from the date of publication to 60 days after the date of publication,” you’re a lot better off pulling a pair of queries: “from the date of publication to 30 days after the date of publication,” and “from 60 days after the date of publication to 29 days before that date” then adding them up. Unless the site is doing more than 500k visits a month, sampling is less likely to get you.
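The “consecutive shorter periods” approach can be sketched like this. It is an illustration under assumptions: `windows` and `total_page_views` are made-up names, and the block passed to `total_page_views` stands in for whatever call actually fetches page views for a date range. No real Analytics API call is shown.

```ruby
require "date"

# Split [start_date, end_date] into consecutive windows of at most
# `days` days each, with no overlap and no gaps.
def windows(start_date, end_date, days = 30)
  result = []
  from = start_date
  while from <= end_date
    to = [from + days - 1, end_date].min
    result << [from, to]
    from = to + 1
  end
  result
end

# Sum page views over the windows. The block is a stand-in for the
# query that returns page views for one (from, to) window.
def total_page_views(start_date, end_date, days = 30)
  windows(start_date, end_date, days).sum { |from, to| yield(from, to) }
end

published = Date.new(2010, 1, 1)
windows(published, published + 59).each do |from, to|
  puts "#{from} to #{to}"
end
```

Keeping each window short enough that a single query stays under the sampling threshold is the whole trick; the window length itself would need tuning to the site's traffic.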

Sequel Pro

Have I mentioned Sequel Pro? If I haven’t, I should have.

It’s a Mac GUI for MySQL that’s good if you’re like me and don’t really like dealing with MySQL. It doesn’t cost anything, either.

A few things I like:

  • Easy record editing. You can double-click and make a quick change on the spot.

  • Easy export and import of selected or all tables in a database. When you’re in the “noodling around” phase of writing something that touches the db, it’s really easy to back in and out of changes quickly.

  • Simple view filters. If you’re not fond of a lot of typing just to find a few records, you can filter records from the data view. It’s not complex (you can only filter on a single column), but there are plenty of filters: contains, greater/less than, earlier/later than, etc.

  • Real queries. If the view filters aren’t complex enough, you can write real queries. It saves any queries you write in a history drop-down, and you can save them to a “favorites” list with a human-readable label. They’re syntax-highlighted, too.

It makes setting up a database pretty easy, too: auto-completion of column types and a little bit of guidance in setting up indexes.

There are some features I haven’t used yet (or as much), that also look pretty neat:

  • Export of the database structure to a GraphViz document.

  • Syntax export for table creation.

In some ways, it’s the best kind of training wheel software: It smooths out a lot of console-jockeying stuff, but doesn’t completely separate the user from what’s going on underneath. You still need to have some idea about what you’re trying to accomplish, but you spend less time worrying about the peculiarities of a particular interface style and can dip into the deeper waters as you learn.

It also makes it a lot easier to add database functionality to scripts, just because it makes managing a database so much easier. If you’re like me and tend to think in terms of ActiveRecord first, whatever the db backend you’re using second, Sequel Pro makes it easier to get to the good stuff faster.

Let’s Hear It for Lossy

So, I’ve got this Excel spreadsheet that represents a dump of over 2,000 articles from an old CMS. Using the same general workflow I used before, I’m importing the spreadsheet into a local MySQL database using the Ruby spreadsheet gem. The importer I wrote just reads line by line, creates a new ActiveRecord object then saves it.

Two thousand lines of spreadsheet — two thousand lines that include news-length content in each of their lines — is enough to make Excel (on the Mac) fall to its knees whining. Worse, after a few changes and saves, the spreadsheet library couldn’t even read the content. I tried saving it between .xls and .xlsx a few times and the problem persisted. I thought about pasting just the data into a blank spreadsheet to see if that would shake whatever rot had settled into the spreadsheet’s soul, but getting Excel to copy and paste from a spreadsheet that big just made it beachball and crash.

So it was in a state of agitation that I decided to try one last thing: I opened the spreadsheet in Numbers, re-saved it as an Excel file and tried reading that into my importer script. It not only worked, but the execution time of the script itself went from somewhere around 10 seconds to about 1.5. It was so much faster I rechecked my script, thinking I’d commented out the line that actually saved the record.

I have no idea what goes on inside an Excel file, but if I had to guess, “a lot of crap” would probably be a big part of it. Some of that crap is, no doubt, useful to someone somewhere who’d just die if they couldn’t have this or that feature that’s enabled by this or that bit of crap. I’d also bet that there are people who are bitterly disappointed that Excel spreadsheets they’ve opened or saved in Numbers have quietly stripped out some essential bit of crap they need to do their work. As near as I can tell, Numbers stripped out a bit of crap that kept me from doing mine. It’s almost like discovering that high-compression JPEGs make bad pictures look better.

MetaGames

I like reading MetaFilter. I also like playing the “Guess the Deletion” game on MetaFilter.

Materials Required

  • An RSS reader that updates frequently enough to remember items that appear on MetaFilter but are eventually deleted.

  • MetaFilter members who post lame stuff.

Game Play

  1. Read the MetaFilter RSS feed now and then. Do not click through to read comments until you complete step 2:

  2. When you think you’ve spotted a post that will surely be deleted, signify that by saying to yourself “oh boy,” or “that’s not gonna fly” or “dead.” (Feel free to amend the list of signifying phrases.)

  3. Click through. Look for the little gray box with text that begins: “This post was deleted for the following reason:”

    • Is the little gray box there? Score a point.

    • Is it not?

      • Signify that you’ve doubled down by saying “This isn’t gonna last” or something similar. Two points if it’s eventually deleted. Lose two points if it stays put.

      • Signify that you choose to lose only one point and move on to the next round by saying “Huh.”

Victory Conditions

  • We can’t tell you how to live your life.

Game Variations*

  • Gain 5 points for getting in before actual deletion and taking a big crap all over the thread provided it’s eventually deleted.

  • Lose 5 points for thinking you made it in before actual deletion and taking a big crap all over the thread, only to realize a day later that it hasn’t been deleted.

  • Gain a point for any “favorites” earned in either type of comment.


* Here is where my flat game manual affect goes out the window: It’s pretty rare for me to comment on MetaFilter, let alone engage in daredevilry like I describe here.

The Things Make Us Stupid II

O would some power the giftie gie us to see ourselves as others see us.

– Robert Burns

Re: the Great Tech Purge:

Maybe it’s not so much “tech writing” as it is “things writing.”

I like tech-related things, so a lot of the things I read about will be written about by tech writers. But in the end, whether they’re battery powered, Wi-Fi connected, assessed in terms of their storage capacity or resolution, they’re things.

One thing a lot of those things have in common is that they’re made by really big companies that invest millions and millions of dollars to build affinity with their brands.

One thing brands do is exploit a back door left open by the gap between our images of ourselves, ourselves as we imagine others perceive us, and ourselves as we aspire to be. It’s hard to balance all three sets of perceptions, let alone actually reconcile them. Brands offer an opportunity for relief from that emotional and cognitive stress by suggesting that they can speak for us now and then, relieving us of the burden of being what we wish we were, or what we wish other people would think we are.

© Michael Hall, licensed under a Creative Commons Attribution-ShareAlike 3.0 United States license.