Tools and Technology I’d Like To Use

This is a short list of technology and tools that I’ve been looking at off and on over the last few months and would like to try out to solve real problems (not just toy projects).

  • RabbitMQ, because queuing makes a lot more sense than polling in a lot of applications.
  • 0MQ, because I like the higher level network connection abstractions.
  • Hadoop (probably from Cloudera), because we have a lot of machines and a lot of data and are always Doing It Wrong.  MapReduce and HDFS could simplify A LOT and make things way faster.
  • Redis 2.2, because having the expire semantics that people expect will make some things a lot easier.  2.0 is working great but 2.2 will fix one of its biggest warts, IMO.
  • Apache Mahout, because I’m curious what we could learn if we feed it some of our data.
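
To make the queuing-versus-polling point concrete, here is a minimal sketch. It uses Python's standard-library queue as an in-process stand-in for a real broker such as RabbitMQ, so it shows the shape of the idea and not any broker's actual API. The names consume and run_demo are made up for illustration.

```python
import queue
import threading


def consume(jobs, results):
    """Handle jobs as they arrive, without a sleep-and-recheck loop."""
    while True:
        job = jobs.get()          # blocks until a producer puts something
        if job is None:           # sentinel (an assumption of this sketch): no more work
            break
        results.append(job.upper())


def run_demo(messages):
    """Push messages through a queue to one worker and return its output."""
    jobs = queue.Queue()
    results = []
    worker = threading.Thread(target=consume, args=(jobs, results))
    worker.start()
    for message in messages:
        jobs.put(message)         # the producer hands work off and moves on
    jobs.put(None)                # tell the worker to stop
    worker.join()
    return results


if __name__ == "__main__":
    print(run_demo(["resize image", "send mail"]))
```

The consumer wakes up only when there is work, which is the property a polling loop has to approximate with timers and repeated "anything new?" queries.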

MongoDB would have been on this list a few months ago, but I’m in the middle of a project using it now.  Come to MongoSV and you can hear about that (of course, I’ll talk about it some here as well).

Since I work in Perl a lot but haven’t kept up on some of what the community has built in the last few years, there are some modules/frameworks that I feel like I should be paying more attention to and trying out:

  • Moose, because it seems to make OO not suck.
  • POE, because I like event-driven stuff in some cases.
  • Coro, because it seems over the top and crazy, but also quite useful.
  • Plack, because I’m starting to think we’d be better off ditching Apache/mod_perl since we’re really not using much of Apache (and we’re still on 1.3).
  • AnyEvent, because I’ve played with it some but would really like to do more.

Now, does anyone have some spare time I can borrow?

About Jeremy Zawodny

I'm a software engineer and pilot. I work at craigslist by day, hacking on various bits of back-end software and data systems. As a pilot, I fly Glastar N97BM, Just AirCraft SuperSTOL N119AM, Bonanza N200TE, and high performance gliders in the northern California and Nevada area. I'm also the original author of "High Performance MySQL" published by O'Reilly Media. I still speak at conferences and user groups on occasion.

5 Responses to Tools and Technology I’d Like To Use

  1. jbmorse says:

    Nice list, I have been considering those top 2 MQ solutions since AMQP seemed to die under its own complexity. Also was thinking about the new Amazon EC2 PHP 5.2 toolkit as well as exercising HTML 5.0 and friends.

  2. Pingback: Always Test with Real Data | Jeremy Zawodny's blog

  3. mrg says:

    I asked the same questions last year, ended up off the deep end of AnyEvent, and have been very rewarded. Forget POE and do plenty of benchmarks on your Moose (or Mouse) for code at scale.

  4. Jeremy,

    Thanks for the Cloudera mention.

    If you have a lot of log data that you need to collect, then you should also kick the tires of Flume. It is a scalable data collection system integrated with Hadoop (though it can be used without it too). You can define agents to collect data from any source(s) and dump it into any sink(s). It comes with a number of predefined sources and sinks.

    https://docs.cloudera.com/display/DOC/Flume+Installation
    http://archive.cloudera.com/cdh/3/flume/UserGuide.html
    http://github.com/cloudera/flume

    Cheers,

    — amr

  5. insight says:

    I would suggest AnyEvent over POE since it integrates nicely with Coro, taking the pain of writing event-driven applications away (i.e. the callback style which POE obscures somewhat). Also, AnyEvent can be run under any event-loop (even POE).
