Visiting Frozen Tenaya Lake

Yesterday Kathleen and I visited Tenaya Lake in the Tioga Pass area of Yosemite National Park. The weather has been odd this season. While it’s been quite cold at the higher elevations, there’s been virtually no precipitation yet, so the lake has frozen over but it is still easily accessible.

Tenaya Lake (frozen)

The ice is easily 8 inches or more thick in most places, so lots of folks were out wandering around. Kathleen wrote up a posting with more pictures on HowToEatAndLive: Walking on Tenaya Lake.

I had a blast sliding around on the lake! At times like this it really pays to live nearby.

Given the rarity of these conditions, we may not be able to do this again for another 10 years or more. And it’s good that we went when we did. The weather is starting to change–headed for a more seasonal mix of cold and rain/snow.

Lots of pictures, including some really cool ice shots she took, are in my Tenaya Lake album on Picasa.

Posted in fun | Leave a comment

Renewed Weight Loss, Health, and a new blog…

Back in mid-2006, I documented my fairly dramatic weight loss in a series of blog postings. Some of you may even remember that. It was on boingboing and all that.

I went from a high of 224 pounds to a low of 164, which made for a total loss of 60 pounds (I’m about 6 feet tall). It’s interesting (and a little disturbing) to go back and re-read what I wrote back then:

Fast forward to January of 2011, and I had gained back about half the weight. I was hovering just below 200 pounds and had needed to purchase larger jeans a few times during the five years in between. Considering the success rate of most diets, that’s not bad but it’s also hardly ideal. Other issues had crept up along the way too.

The story of how I re-lost most of that weight and generally got a lot healthier is the subject of Learning to Eat Well and Stay Healthy, a blog post I just wrote on a new site called HowToEatAndLive that my wife Kathleen and I have started for 2012. It’s going to document a lot of what we’ve learned about health, nutrition, and living in the last year (and interesting things we continue to discover in the new year). We’ve both learned a lot and have tons to write.

If you’re curious to learn about how I lost the weight I gained back and have kept it off, that’s the place to go. It’s a little rough around the edges yet, but we’re working to add some finishing touches as well as a lot more information.

Posted in health | 6 Comments

On Good Enough Code

Reading There’s no shame in code that is simply “good enough” (and the discussion on hacker news), I find myself strongly agreeing. I have this system I built about 3 years ago at craigslist. Since then it has been extended in various ways and now runs on about four times the hardware it did back then. And it handles far, far more data.

In other words, I’m still evolving it (in small steps) and helping it to grow and change to suit our needs. But every now and then, while reading through the code, I get the urge to “improve” it in some way. Often that means adding a new abstraction “just in case” I need it in the future. Or there’s that one script that runs from cron, which is written in bash. I’d love to re-write it in Perl. But that’d be more work for virtually no benefit.

Why do I not just make these changes? Two reasons:

  1. the stuff works and is stable
  2. every time you change something you may also break something

The fact that I can look back on code I wrote a few years ago and identify ways that I’d do it better is good. It means I’m still learning. But the fact that I can successfully resist the urge to change the code is even better.

The quest for perfection is tricky. You can spend a lot of time chasing it and there’s no telling if you’ll ever actually get there. And even if you do, what other things did you put aside in the process?

I think this mentality can apply to a lot of things in life. One of my hopes for 2012 is that I can better see that.

Posted in Uncategorized | 9 Comments

Christmas on the Beach

Christmas on the Beach

Shot just south of Morro Bay, California.

Image | Posted on by | 2 Comments

Easy and Delicious Turkey Dinner

We had a few friends over for Thanksgiving this year and were lucky enough to split the cooking with them. One of my tasks was making the turkey. We had purchased a pair of 12 pound Diestel Turkeys through our local CSA and planned to use one of them for the dinner.

Looking around for ideas, I stumbled upon Mom’s Roast Turkey on SimplyRecipes and it seemed too easy to be that good. But that couldn’t have been farther from the truth. It ended up being some of the best turkey we’d ever had. The veggies inside were fantastic, the bird was cooked perfectly, and the juices made for amazingly rich and delicious gravy.

2011 Thanksgiving Turkey

We used some lemon infused olive oil on the outside, along with some excellent French grey sea salt, fresh rosemary, and fresh oregano. Inside was some fresh lemon juice, organic carrots, celery, and onions.

I know that pictures really don’t do it justice, but let’s just say that we’ve been eating leftovers for days now and haven’t begun to get sick of them. :-)

In addition to the turkey and gravy, we had some excellent stuffing, garlic mashed potatoes, mashed sweet potatoes with brown sugar, asparagus, and an amazing cranberry/apple/orange sauce.

Pictures are here: Thanksgiving 2011 on Picasa.

Posted in cooking, food | 1 Comment

Looking forward to Redis 2.6

In reading Short term Redis plans, I’m happy to see the “More introspection” section. For a long time some in the Redis community have asked for the ability to publish key names to a channel when they expire. And, while I sympathize with their desire for such a feature, I also realize that it’s not the greatest solution to the problem (since pub/sub is best effort–a client could be disconnected for a bit and miss messages).

But it looks like Salvatore is taking things a step or two farther…

There is a plan to use Pub/Sub in order to communicate events happening inside Redis, like a key that expired, clients connecting / disconnecting, operations performed against keys. We’ll probably allow the user to script this feature with Lua so that you can, for instance, push all the keys expired inside a list as well, or other things that can’t be reliably done with clients and Pub/Sub since the client is not guaranteed to get all the messages (it can get disconnected for some reason).

This strikes me as really good. He’s been listening to feature requests for a long time. Some appear and vanish after a short time, while others persist. This has been a persistent request for a long time now. Building it in a way that allows for robust notification ought to make everyone happy.
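
To make the difference concrete, here is a toy in-memory model written in Python. It is not the Redis API and does not talk to a Redis server; the class and method names are made up for illustration. It only shows why a subscriber that drops off for a moment misses events, while a list that the server appends to keeps all of them.

```python
from collections import deque


class ToyServer:
    """Illustrative stand-in for the two delivery styles (not Redis itself)."""

    def __init__(self):
        self.connected = []          # inboxes of clients connected right now
        self.expired_keys = deque()  # list kept on the server side

    def connect(self, inbox):
        self.connected.append(inbox)

    def disconnect(self, inbox):
        # Compare by identity so only this client's inbox is dropped.
        self.connected = [c for c in self.connected if c is not inbox]

    def expire(self, key):
        # Best effort: only clients connected at this moment see the event.
        for inbox in self.connected:
            inbox.append(key)
        # Reliable: the key name is also recorded in a server-side list.
        self.expired_keys.append(key)


server = ToyServer()
inbox = []
server.connect(inbox)
server.expire("session:1")
server.disconnect(inbox)
server.expire("session:2")  # happens while the client is away
server.connect(inbox)
server.expire("session:3")
```

After this runs, the subscriber's inbox holds only session:1 and session:3, while the server-side list holds all three key names.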

I’ve personally not allowed myself to design systems that would require knowing when a key expires, but seeing this on the roadmap really does open up a lot of possibilities for future work.

Posted in programming, redis, tech | 1 Comment

Experimenting with Real-Time Search using Sphinx

In the last few weeks I’ve been experimenting with using real-time indexes in Sphinx to allow full-text searching of data recently added to craigslist. While this posting is not intended to be a comprehensive guide to what I’ve implemented (that may come later in the form of a few posts and/or a conference talk) or a tutorial for doing it yourself, I have had a few questions about it and learned a few things along the way.

Implementation

I’m building a “short-term” index of recent postings. The goal is to always have a full day’s worth of postings searchable–possibly more, but a single day is the minimum. To do this, I’m using a simple system that looks like a circular buffer of sorts. Under the hood, there are three indexes:

  • postings_0
  • postings_1
  • postings_2

My code looks at the unix timestamp of the date/time when a posting was posted (we call this posted_date), runs that through localtime(), and uses the day-of-year (field #7). Then I divide that value by the number of indexes (3 in this case) and use the remainder to decide which index it goes into.

In Perl pseudocode, that looks like this:

    sub index_for_posting {
        my ($posting, $num_indexes) = @_;
        # Element 7 of localtime() is the day of the year (0-based).
        my $doy = (localtime($posting->{posted_date}))[7];
        return "postings_" . ($doy % $num_indexes);
    }

That means, at any given time, one index is actively being written to (it contains today’s postings), one is full (containing yesterday’s postings), and one is empty (it will contain tomorrow’s postings starting tomorrow).
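
As a rough sketch of that rotation (in Python rather than Perl, and with hypothetical function names that are not taken from the production code), the three roles for a given moment could be computed like this. It assumes exactly the day-of-year modulo scheme described above.

```python
import time

NUM_INDEXES = 3  # postings_0 .. postings_2
DAY = 24 * 60 * 60


def index_for(posted_date, num_indexes=NUM_INDEXES):
    """Map a unix timestamp to an index name using the local day of year."""
    # Python's tm_yday is 1-based; subtract 1 to match Perl's 0-based field.
    doy = time.localtime(posted_date).tm_yday - 1
    return "postings_%d" % (doy % num_indexes)


def roles(now, num_indexes=NUM_INDEXES):
    """Return which index is being written, which is full, and which to purge."""
    return {
        "active": index_for(now, num_indexes),            # today's postings
        "full": index_for(now - DAY, num_indexes),        # yesterday's postings
        "purge_next": index_for(now + DAY, num_indexes),  # tomorrow's target
    }


# Local noon on 15 June 2011, used here only as a sample input.
sample = time.mktime((2011, 6, 15, 12, 0, 0, 0, 0, -1))
print(roles(sample))
```

One thing the sketch makes easy to check: the day-of-year counter resets at the new year, so the three roles are not guaranteed to be distinct across that boundary.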

The only required maintenance then is to do a daily purge of any old data in the index that we’ll be writing to tomorrow. More on that below…

The final bit is a “virtual index” definition in the sphinx configuration file that allows us to query a single index and have it automatically expand to all the indexes behind the scenes.

    index postings
    {
        type  = distributed
        local = postings_0
        local = postings_1
        local = postings_2
    }

That allows the client code to remain largely ignorant of the back-end implementation.  (See dist_threads, the New Right Way to use many cores).

Performance

Even on older (3+ years) hardware, I find that using bulk inserts, I can index around 1,500-1,800 reasonably sized postings per second. This is using postings of a few KB in size, on average. But some categories often contain far heavier ads (think real estate). And some are much smaller.
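
For anyone wondering what "bulk inserts" might look like, here is a minimal Python sketch. It assumes the inserts go through SphinxQL (Sphinx's MySQL-protocol interface) using multi-row INSERT statements, and that a MySQL-style DB-API driver with %s placeholders is in use. The column names (id, title, body) and the helper names are hypothetical, not the actual craigslist schema or code.

```python
from itertools import islice


def batches(rows, size):
    """Yield successive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def bulk_insert_sql(index, chunk):
    """Build one multi-row INSERT and its flat parameter list.

    Each row is an (id, title, body) tuple; these columns are assumptions.
    """
    placeholders = ", ".join(["(%s, %s, %s)"] * len(chunk))
    sql = "INSERT INTO %s (id, title, body) VALUES %s" % (index, placeholders)
    params = [value for row in chunk for value in row]
    return sql, params


rows = [(1, "bike", "red bike"), (2, "sofa", "blue sofa"), (3, "desk", "oak desk")]
for chunk in batches(rows, 2):
    sql, params = bulk_insert_sql("postings_0", chunk)
    # With a real connection this would be: cursor.execute(sql, params)
    print(sql, params)
```

The point is simply that one statement carries many postings, which cuts down on round trips compared with inserting one row at a time.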

Query performance is quite good as well. As with any full-text system, the performance will depend on a lot of factors (query complexity, data size vs. RAM size, index partitioning, number of matches returned). But for common queries like “find all the postings by a given user in the last couple of days” I’m seeing response times in the range of 5-30ms most of the time. That’s quite acceptable–especially on older hardware.

I haven’t yet performed any load testing using lots of concurrent clients, more complex queries, or edge cases that find many matches per query. That will certainly be a fun exercise.

Limitations

If you’re curious about using Sphinx’s implementation of real-time indexes, there are a few limitations to be aware of (which are not well documented at this point). The first thing I bumped into is the rt_mem_limit configuration directive. That tells sphinx, on a per-index basis, how large the RAM chunks (in-memory index structures) should be. Once a RAM chunk reaches that size it is written to disk and transformed into a disk chunk. Unfortunately, rt_mem_limit is represented internally as a 32-bit integer, so you cannot have RAM chunks that are, say, 8GB in size.
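
The arithmetic behind that limit is easy to check. The Python sketch below assumes the value is a byte count, and since I haven't said whether the integer is signed or unsigned, it shows both cases. The helper name is made up for illustration.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def fits_in_32_bits(num_bytes, signed=False):
    """Report whether a byte count is representable in a 32-bit integer."""
    limit = 2 ** 31 - 1 if signed else 2 ** 32 - 1
    return 0 <= num_bytes <= limit


desired = 8 * GIB  # the 8GB RAM chunk from the text above

print(desired)                                # 8589934592
print(2 ** 32 - 1)                            # 4294967295, unsigned maximum
print(2 ** 31 - 1)                            # 2147483647, signed maximum
print(fits_in_32_bits(desired))               # False
print(fits_in_32_bits(desired, signed=True))  # False
```

Either way, the ceiling lands somewhere just under 2GB or just under 4GB per RAM chunk, well short of 8GB.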

The hardware I’m deploying on has 72GB of RAM and I’d like to keep as much in memory as possible. Since we’re a paying support customer with some consulting (feature development) hours available, I’ve asked to put this on the list of features we’d like developed.

Secondly, in order to complete this “round-robin” style indexing system, I need to efficiently remove all postings from tomorrow’s index once a day. Currently the only way I can do that is to shut down sphinx, remove the relevant chunks, and then start it back up. There’s no TRUNCATE INDEX command (think of MySQL’s TRUNCATE TABLE command). This is currently at the top of our feature wish list.

The final issue that I ran into is that there’s currently no built-in replication of indexes from server to server. That’s not a big issue, really. It’s just different than our master/slave implementation of “classic” (batch updating, disk-based) Sphinx search that I built a few years ago.

Reliability

I’m happy to report that I’ve not found a way to crash it in normal use. When I first made a serious attempt at using it last year, that was not the case. I filed a few bugs and they got fixed. But now, as far as I can tell, it “just works.”

Having said that, it’s a good idea to have a support contract with Sphinx if it is mission critical to your business (we handle hundreds of millions of searches daily) and there are some features you’d like to see built in. We’ve been happy customers for years and I personally have found them easy to work with and understanding of our needs.

See Also

Here are some old blog postings and presentations that I’ve published about our use of Sphinx at craigslist.

I’ll try to write more about our use of real-time indexes as my prototype moves into production and we get a chance to learn more from that experience. In the meantime, feel free to ask questions here.

Posted in craigslist, sphinx, tech | 11 Comments

Fighting Snakes Caught on Video

We spent just under a week in Tucson, Arizona recently for some required airplane maintenance. While there, we decided to visit Saguaro National Park, where we did a hike that took us along a dry riverbed and encountered a pair of Western Diamondback Rattlesnakes fighting for dominance.

Both of us tried to shoot some video, but I was too dumb to use my camera properly. Thankfully, the Canon SD800 that Kathleen was carrying worked just fine. Here’s the video.

The park rangers were very excited when we returned with copies of that video. Apparently it’s not that common to see a pair of snakes dueling out in the open like that.

Posted in other | 7 Comments

Parting Advice from Steve Jobs

It’s hardly news at this point that Steve Jobs died today. And like many folks, I’m rather shocked by how quickly this happened, considering how recently he stepped down from his role as CEO of Apple.

Though it’s incredibly sad news and reading about it doesn’t help make it any easier (except for those rare people who are revealing something truly new and insightful about how they knew Steve), I find a lot of comfort and wisdom in the 2005 Stanford Commencement Address he gave.

If you’ve never heard or seen it before, take a few minutes and do so now. It’s well worth your time. Of the many things he says in that brief speech, here’s something that has always stuck with me.

You’ve got to find what you love!

The only way to be truly satisfied is to do what you believe is great work. And the only way to do great work is to love what you do.

If you haven’t found it yet, keep looking–and don’t settle.

I couldn’t have said it better myself.

Steve Jobs is gone. But it is my sincere hope that the passion and inspiration that has spread to so many people over the years will live on and continue to impact the world for years to come.

Posted in other | 4 Comments

I’m Bad For Business!

Apparently Kevin Drum of Mother Jones thinks so. Check out the article Overpaid in D.C. where I’m featured in a couple of fake government ID badges.

as seen in mother jones

Needless to say, when I arrived at the office today and shared that link, I was greeted with more than the normal amount of laughter.

Seriously, MJ, how about a link or attribution at least?

UPDATE: here is the original source

Posted in wtf | 5 Comments