my stuff

what's new with me

Tuesday, August 02, 2005

oh boy.

Just kicked off a Perl script that's going to parse ~100 GB of HTML into several databases, keeping track of both the pages' link structure and their internal structure. When I'm done I'll have a graph of around 200 million nodes that I'll have to run some statistics on.
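
For anyone wondering what "keeping track of link structure" means in practice, here is a rough sketch of the idea. It is in Python rather than Perl, it is not the actual script, and the names `LinkExtractor` and `page_edges` are made up for illustration. It only pulls `<a href>` targets out of one page and turns them into (source, target) edges. Where those edges get stored is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkExtractor(HTMLParser):
    """Collects absolute link targets from <a href=...> tags (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links and drop the #fragment so that
                # links into the same page map to a single graph node.
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                self.links.append(absolute)


def page_edges(page_url, html):
    """Return one (source, target) edge per link found in the page."""
    parser = LinkExtractor(page_url)
    parser.feed(html)
    parser.close()
    return [(page_url, target) for target in parser.links]
```

Calling `page_edges(url, html_text)` for each page and appending the results to an edge table would give the raw graph. Deduplicating edges and mapping URLs to compact node IDs would be separate steps.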

Wish me luck.

2 Comments:

  • At 9:59 AM, Sid said…

    May the gods of hard drive sector protection and process longevity smile upon you.

     
  • At 5:33 PM, jacob said…

    Thanks for that. It's process longevity that I'm especially worried about. My program will back up its state occasionally, but even so...
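
A minimal sketch of what occasional state backup can look like, assuming the work is a list of files processed in order. This is Python, not the Perl script itself, and `save_checkpoint`, `load_checkpoint`, `process_all`, and the `every` parameter are all hypothetical names. The state is written to a temporary file and then renamed into place, so a crash in the middle of a save should not leave a half-written checkpoint behind.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state atomically: temp file in the same directory, then rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(state, handle)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)     # atomic replace of the old checkpoint
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_checkpoint(path, default):
    """Return the saved state, or default if no checkpoint exists yet."""
    try:
        with open(path) as handle:
            return json.load(handle)
    except FileNotFoundError:
        return default


def process_all(files, checkpoint_path, handle_file, every=1000):
    """Process files in order, saving progress after every `every` files."""
    state = load_checkpoint(checkpoint_path, {"done": 0})
    for index in range(state["done"], len(files)):
        handle_file(files[index])
        state["done"] = index + 1
        if state["done"] % every == 0:
            save_checkpoint(checkpoint_path, state)
    save_checkpoint(checkpoint_path, state)
    return state
```

If the process dies, rerunning `process_all` with the same checkpoint path picks up from the last saved count. Files handled after that save get processed again, so `handle_file` needs to tolerate being repeated.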

     
