Network size

Written by dcat on 25.05.2009 | General

The network size is featured on the front page once again. When the new crawler was implemented, that statistic had to be dropped because, given how the new crawler worked, it was too resource intensive to calculate. That issue has now been resolved.

Some background

The number of leaves on the network isn’t a good [...]

A Quick Update

Written by dcat on 12.02.2009 | General

I haven’t made a post in a while, so I thought I should.
Not much is going on with the crawler right now. I’ve been pretty busy lately and haven’t had any time to spend on improving it. However, there were a few subtle updates to many of the web pages. More detailed descriptions were added to [...]

The Architecture of a Crawler

Written by dcat on 01.11.2008 | General

I’m going to explain how crawlers work. A crawler has three main tasks to take care of:

1. Find new hosts to crawl.
2. Request data from a host that is being crawled.
3. Display the gathered data to the user.

This design lends itself well to being distributed. Several host crawlers (those that perform task 2) can all [...]
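
As a rough illustration of the three tasks, here is a minimal Python sketch. It is not the crawler's actual code: the host names, the FAKE_NETWORK table, and the functions request_host, crawl, and report are all hypothetical, and the network request is simulated with a dictionary lookup so the example runs on its own.

```python
from collections import deque

# Hypothetical stand-in for the network: each host maps to the hosts it
# would report when queried. A real crawler would send a request instead.
FAKE_NETWORK = {
    "hub-a": ["hub-b", "hub-c"],
    "hub-b": ["hub-a", "hub-d"],
    "hub-c": [],
    "hub-d": ["hub-a"],
}


def request_host(host):
    """Task 2: request data from one host (simulated here by a lookup)."""
    return FAKE_NETWORK.get(host, [])


def crawl(seed):
    """Task 1: find new hosts to crawl, breadth-first from a seed host."""
    pending = deque([seed])   # hosts waiting to be crawled
    seen = {seed}             # hosts already queued, to avoid repeats
    results = {}              # host -> hosts it reported
    while pending:
        host = pending.popleft()
        neighbours = request_host(host)
        results[host] = neighbours
        for neighbour in neighbours:
            if neighbour not in seen:
                seen.add(neighbour)
                pending.append(neighbour)
    return results


def report(results):
    """Task 3: format the gathered data for display to the user."""
    return "\n".join(
        f"{host}: {len(neighbours)} neighbours"
        for host, neighbours in sorted(results.items())
    )


if __name__ == "__main__":
    print(report(crawl("hub-a")))
```

In this sketch request_host is the only piece that touches a host, so it is the part that several workers could run in parallel while sharing one pending queue.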

Recent Updates

Written by dcat on 19.10.2008 | General

My focus lately has been on hub uptimes. There is a new page showing hub uptime distribution graphs. It gives a visual representation of some of the categories on the uptimes page. The overall hub uptime distribution graph also features two vertical lines. The red line shows where the average hub uptime is and the [...]
