Monitor stories from your website appearing on digg.com

Blogging, Google 1 Comment »


I’ve been looking for a way to easily monitor stories from my website that appear on digg.com so that I can prepare if a story looks like it might reach the front page. My first solution was to simply subscribe to the RSS search results feed for my domain. Doing this is pretty simple: search for your domain using digg’s search utility, specifying “URL only” as the search criteria. Then subscribe to the RSS feed on the results page.
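If you'd rather check the feed from a script than from a feed reader, here's a minimal sketch. It assumes the search results come back as ordinary RSS 2.0; the sample feed and the example.com URLs are made up for illustration, and a real run would download the feed from the URL on the results page instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of an RSS 2.0 search-results feed. A real feed would be
# downloaded from the RSS link on the search results page.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Search results for example.com</title>
    <item>
      <title>First story</title>
      <link>http://example.com/first-story</link>
    </item>
    <item>
      <title>Second story</title>
      <link>http://example.com/second-story</link>
    </item>
  </channel>
</rss>"""


def list_stories(feed_xml):
    """Return (title, link) pairs for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(feed_xml)
    stories = []
    for item in root.iter("item"):
        title = item.findtext("title", default="").strip()
        link = item.findtext("link", default="").strip()
        stories.append((title, link))
    return stories


if __name__ == "__main__":
    for title, link in list_stories(SAMPLE_FEED):
        print(f"{title}: {link}")
```

Run it on a schedule and compare against the previous run to spot newly submitted stories.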

I’ve added the resulting RSS feed to my Google homepage, which allows me to keep tabs on upcoming stories from my site. But there are two problems with this method: it doesn’t tell you how many diggs stories have, and Google caches the feed results so you’re looking at old data.

After my recent foray into Google Gadgets I realized that a simple Google Gadget could solve both problems, so I wrote one. It’s based on the official Digg gadget with one additional option: monitoring stories for a specific website. If you want to try it out, you can click here to add it to your Google homepage.

Digg Widget 2.0

Blogging, Programming 2 Comments »

DiggFriends 0.02
A couple of weeks ago I wrote a post on the Vino2Vino development blog about a digg widget that displayed your friends’ digg activity on your blog. I wrote the widget in response to a request by another digg user. If you didn’t read the original post, the widget displays a list of articles recently dugg by your friends. As I said in the original post, the code was a simple proof of concept, but it drew a lot of attention and a lot of criticism.

The complaints were all pretty accurate (no built-in caching, scraping data instead of using RSS, poor styling, etc.), and I promised to address them in a future release of the widget. So here it is — DiggFriends 0.02 Beta. You can see a working example of the new version in the sidebar here on my blog.


How to make a Google Gadget in 15 minutes or less

Google, Web Development 26 Comments »


A Google Gadget is a small XML file that generates a widget on a Google Personalized Homepage. Google has excellent documentation describing how to make a Gadget, but it’s so verbose that it hides just how simple it is to make your own Gadget, especially if you already have a widget or feed on your website that you’d like to Gadgetize (TM). It’s really, really easy! And it can generate a ton of traffic for your site.


RobotReplay - Watch your users interact with your website

Uncategorized 1 Comment »

RobotReplay just launched tonight at the Web2.0 Expo. It lets you record your users’ browsing sessions and play them back later - they call it ‘cinelytics’. Membership is open to the public (unlike some similar services) and can be set up in seconds by adding a single script tag to your site’s HTML.


Live from the Web2.0 Expo

Uncategorized No Comments »

Streaming video from the Web 2.0 Expo in San Francisco… until my laptop battery runs out.

Network Engineering in a Web 2.0 World

Uncategorized 1 Comment »

I just found an interesting blog post titled Web 2.0 & Death of the Network Engineer. The argument is that Network Engineers are no longer relevant to today’s Internet because the complexity of underlying network infrastructure and basic network services has been abstracted away.

There’s something to this argument when you analyze ordinary web startups that are doing the same thing everybody else is doing. If you’re not doing anything unique, you probably won’t need to go any lower than HTTP to build your application. But the most interesting startups today (and always) are the ones that are pushing the envelope. Companies like Joost (TV over the Internet), dash (sat-nav with realtime traffic information), Jangl (phone call anonymizer), and Sling all require low-level network engineering to do right. There are even companies like Aspera that have built their entire business around innovative network technologies (they provide file transfer technologies for many popular software packages, e.g. iTunes).

One could argue that all of these companies are doing things outside of the typical HTTP client/server environment of the web, which is why they require special network technologies. But isn’t breaking out of the normal client/server web environment the whole point of “Web 2.0”?

Measuring the Digg Effect

Blogging No Comments »

So I got dugg earlier today, neat. I woke up at around 9am (EST) and saw that my post on how not to optimize a MySQL query had just made the front page of Digg. Cool. I ran tail -f on my apache logs and saw lots of traffic. Even cooler. I ran top and saw the load average was around 50 (that’s a five followed by a zero, on a single processor box). Not so cool. I immediately got up and took a shower, cuz I can’t do jack without a shower. After showering I installed WP Cache.

As you can see from the MRTG graphs below it made a bit of difference (note that the load is multiplied by 100 in the graph). Before WP Cache the load average was bouncing around in the 30 to 70 range. The site was responding, but very s - l - o - w - l - y. It hadn’t been ‘dugg’, but it was close. Tip: having a dedicated server (or two in this case) helps here. After WP Cache the web server load dropped to about 0.02. Latency dropped to near zero. Beautiful.

You can see the difference it made in the network utilization data. The green is incoming traffic, blue is outgoing. The green spike is the incoming SQL result sets coming from the database server. After WP Cache was installed it (predictably) dropped to pre-digg levels. Curiously, the database server never flinched. Apparently WP is not that database intensive.

I’ve also posted a number of graphs from Google Analytics, in case anyone is curious. I like these numbers: 73% Firefox vs. 14% IE, 15% Mac, 10% Linux. Sweet. There have been a total of 17,175 visits so far, with a peak of 5,421 visitors from 9am to 10am. Note that some of the times may need adjustment since Google Analytics reports in PST and the server is on Central time.

If you’re interested in some statistic that I forgot to mention, you can download a copy of the log file (gzip format) (bzip2 version) for analysis. It’s 2.1MB using bzip2, 4MB gzipped, 125MB unzipped. If you find anything interesting post a comment and let me know. If you can’t figure out how to calculate whatever you’re looking for post a comment anyways and I’ll see if I can help you (or maybe someone else can).
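If you want a starting point for that analysis, here's a rough sketch that counts hits per hour. It assumes the log uses Apache's common or combined format (timestamps like [10/Apr/2007:09:15:02 -0500]); the sample lines and IP addresses below are invented, not taken from the real log.

```python
import re
from collections import Counter

# Matches the timestamp in Apache common/combined log format,
# e.g. [10/Apr/2007:09:15:02 -0500], capturing the date and the hour.
TIMESTAMP = re.compile(r"\[(\d{2}/\w{3}/\d{4}):(\d{2}):\d{2}:\d{2} [+-]\d{4}\]")

# Invented sample lines standing in for the real access log.
SAMPLE_LOG = [
    '192.0.2.1 - - [10/Apr/2007:08:59:58 -0500] "GET / HTTP/1.1" 200 5120',
    '192.0.2.2 - - [10/Apr/2007:09:00:03 -0500] "GET / HTTP/1.1" 200 5120',
    '192.0.2.3 - - [10/Apr/2007:09:15:02 -0500] "GET /feed HTTP/1.1" 200 2048',
    'this line is malformed and will be skipped',
]


def hits_per_hour(lines):
    """Count requests per (date, hour) bucket, skipping unparseable lines."""
    counts = Counter()
    for line in lines:
        match = TIMESTAMP.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


if __name__ == "__main__":
    for (day, hour), hits in sorted(hits_per_hour(SAMPLE_LOG).items()):
        print(f"{day} {hour}:00  {hits} hits")
```

Any iterable of lines works as input, so for the gzipped download you could pass the file object returned by Python's gzip.open(path, "rt") straight to hits_per_hour.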

Load Average

April 10 - Digg Effect Load Average

CPU Usage

April 10 - Digg Effect CPU Usage

Network Utilization

April 10 - Digg Effect Network Utilization

Traffic Stats

April 10 - Digg Effect Google Analytics

Browser Version

April 10 - Digg Effect Browser Version

Platform

April 10 - Digg Effect Platform

Impress your friends with your blog stats

Blogging, Programming 78 Comments »

I was chatting with my buddy Ryan earlier about PaulStamatiou.com and noticed the little word/comment counter he has in the header of his blog. I thought it was cool, so I wrote a WordPress plugin to generate the same stats for my blog! Now I’m making it available to you. I’m calling the plugin Impress, and you can download it here (pretty-printed version).

Here’s how you install it:

  1. Save the file to your wordpress plugins directory (wp-content/plugins) as impress.php
  2. Activate the plugin under the plugins tab on the wordpress admin panel (it should be the one called Impress)
  3. Place a call to the impress() function in your header.php, footer.php or some other template file - it should look like <?php impress(<format>); ?>

That’s all!

The impress function takes one argument: a specially formatted string that determines what the output will be. There are eight special keywords that will be replaced with your statistics; the rest of the string can be anything (it will be displayed along with the stats). The keywords are :users, :posts, :pages, :comments, :categories, :post_wordcount, :page_wordcount, and :comment_wordcount. They’re pretty self-explanatory, so I’ll let you figure out what they mean.

Here’s an example from my blog (see the lower right-hand corner):

<p><?php impress("So far I've written :post_wordcount words
in :posts posts. :comments comments have been posted,
with a total of :comment_wordcount words."); ?></p>
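To make the keyword replacement concrete, here's a small sketch of the idea in Python. This is not the plugin's actual code, just one way the substitution could work; the render function and the numbers in SAMPLE_STATS are made up for illustration.

```python
# Made-up statistics. The real plugin would compute these from the blog's
# database; the names match the eight keywords listed above.
SAMPLE_STATS = {
    "users": 1,
    "posts": 12,
    "pages": 3,
    "comments": 40,
    "categories": 6,
    "post_wordcount": 8500,
    "page_wordcount": 900,
    "comment_wordcount": 2100,
}


def render(format_string, stats):
    """Replace each :keyword in format_string with its value from stats."""
    result = format_string
    # Replace longer keywords first so a short keyword can never clobber
    # part of a longer one.
    for name in sorted(stats, key=len, reverse=True):
        result = result.replace(":" + name, str(stats[name]))
    return result


if __name__ == "__main__":
    print(render("I've written :post_wordcount words in :posts posts.", SAMPLE_STATS))
```

Anything in the string that isn't a keyword passes through untouched, which is why you can wrap the stats in whatever text you like.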

Let me know if you have any thoughts, comments, suggestions, etc. Otherwise, enjoy!

Update: It’s fast, too! I’m being dugg right now and the server’s not breaking a sweat.

Jason Mraz’s Latest Hit: Tacos & Mojitos

Uncategorized 2 Comments »

This is just a little clip I’ve been meaning to put online for a while now from a Jason Mraz concert at Virginia Tech. Mraz came up with the song after gathering suggestions for lyrics from the audience for about 10 minutes. The result was pretty amusing.

You can download the song (along with the rest of the album) from archive.org.

How not to optimize a MySQL query

Database, MySQL, Programming, SQL 32 Comments »

I just read a blog post discussing mysql query optimization and thought I’d put in my two cents.

The post suggests using a number of MySQL-specific statements (e.g. SQL_SMALL_RESULT, HIGH_PRIORITY/LOW_PRIORITY, and INSERT DELAYED; STRAIGHT_JOIN was conspicuously missing). Unless absolutely necessary, this is usually A Bad Idea for at least two reasons. First, they are specific to MySQL, which makes your database code less portable. This might or might not be a problem. Second, and perhaps more importantly, giving the query optimizer this sort of hint can lead to decreased performance in the future when your data or the optimizer changes. Telling the optimizer to anticipate a small result set (with SQL_SMALL_RESULT) might seem like a good idea, but could lead to problems when your table grows and the result becomes large! Basically, use these keywords with caution, and only when you really need them. And when you do use them, take special care in documenting where and why they’re in use.

The truth is there is no silver bullet that is going to make MySQL (or any dbms) run a poorly written query lightning fast. But here are some tips that the post somehow neglected to mention.

Properly index your tables

If you do a lot of lookups using a particular column of a table, or if you join on a column, that column should be indexed. Moreover, if all of the data that you are retrieving is available in the index (e.g. you’re using a multi-column index) then MySQL can avoid looking at the table altogether and execute your query using just the index.
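Here's a quick way to see the index-only effect for yourself. Note the assumptions: this sketch uses SQLite (via Python's built-in sqlite3 module) rather than MySQL so that it runs anywhere, and the posts table and idx_author_title index are made up. In MySQL you'd run EXPLAIN on the query and look for "Using index" in the Extra column instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts ("
    "id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT, body TEXT)"
)
# Multi-column index: lookups by author_id can also read title from the index.
conn.execute("CREATE INDEX idx_author_title ON posts (author_id, title)")


def query_plan(conn, sql, params=()):
    """Return SQLite's human-readable plan steps for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description


# Every column this query needs is in the index, so the table is never read.
covered = query_plan(conn, "SELECT title FROM posts WHERE author_id = ?", (1,))

# body is not in the index, so each match needs an extra lookup in the table.
not_covered = query_plan(conn, "SELECT body FROM posts WHERE author_id = ?", (1,))

if __name__ == "__main__":
    print(covered)
    print(not_covered)
```

The first plan should mention a covering index while the second uses the same index without covering the query, which is exactly the difference described above.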

Avoid superfluous queries

Don’t do this:

// Anti-pattern: one extra query for every row of table1
$result = query_db('select * from table1');

foreach ($result as $row) {
  $array[] = query_db('select * from table2 where column = '.$row['id']);
}

Do this:

$result = query_db('select table2.* from '
       .'table1, table2 where table1.id=table2.column');
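If you want to convince yourself that the two approaches return the same rows, here's a runnable sketch. Assumptions: it uses SQLite through Python's sqlite3 module instead of MySQL, the sample data is invented, and the placeholder column from the snippets above is renamed to ref here.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory database with invented sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table1 (id INTEGER PRIMARY KEY);
        CREATE TABLE table2 (id INTEGER PRIMARY KEY, ref INTEGER, label TEXT);
        INSERT INTO table1 (id) VALUES (1), (2), (3);
        INSERT INTO table2 (ref, label)
            VALUES (1, 'a'), (1, 'b'), (3, 'c'), (9, 'orphan');
    """)
    return conn


def fetch_with_loop(conn):
    """The superfluous version: one query for table1, then one per row."""
    queries = 1
    rows = []
    for (row_id,) in conn.execute("SELECT id FROM table1").fetchall():
        rows.extend(conn.execute(
            "SELECT ref, label FROM table2 WHERE ref = ?", (row_id,)
        ).fetchall())
        queries += 1
    return sorted(rows), queries


def fetch_with_join(conn):
    """The single-query version: let the database match the rows."""
    rows = conn.execute(
        "SELECT table2.ref, table2.label "
        "FROM table1 JOIN table2 ON table1.id = table2.ref"
    ).fetchall()
    return sorted(rows), 1


if __name__ == "__main__":
    conn = make_db()
    print(fetch_with_loop(conn))
    print(fetch_with_join(conn))
```

With three rows in table1 the loop issues four queries where the join issues one, and the gap grows with every row you add.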

Look for bottlenecks

Don’t waste time optimizing queries that aren’t bottlenecks in your application. Find the low hanging fruit and correct those problems first.

Learn SQL

This is the most important tip. SQL optimization really has to be done on a case by case basis, and you can’t do it unless you have a good understanding of the language and how you can use it to your advantage. You need to understand things like subqueries, grouping, left joins vs. right joins vs. full joins, etc. There is no free lunch.

If you’re interested in learning more, I highly recommend Stephane Faroult’s book The Art of SQL.

Copyright © 2007 - Mike Malone / Icons by N.Design Studio