Thursday, 15 December 2011

Confessions: FSDB

Robert Horvath shares with us a confession from a system a long, long time ago.




When I was twelve years old, I wrote my first point-and-shoot game on my Commodore 64. The next year, I burned out that very same C64 by trying to use its sound chip for placing phone calls. At fourteen, I taught myself Pascal and wrote a little program that hid itself in memory and snooped on my classmates’ and teachers’ passwords as they typed them in. There was no question about it. Before I even entered high school, I was a bona fide computer genius.


By the time I graduated high school, I was ready to take on the world – or, more specifically, the World Wide Web. With my technical acumen, I had no problem finding a job as a “webmaster” at a local marketing company. My job mainly consisted of creating and copying static HTML pages on different servers for different websites, but every now and then a client needed something a little more exciting. And, as obviously the most qualified guy there, I always volunteered.


One client – a fairly large print magazine – wanted to not only bring their content online, but maintain it themselves. With my two months of HTML experience, I was just the person to implement a content management system!


After spending a few hours googling on AltaVista, I quickly became an expert on CGI. I then picked up a copy of “Teach yourself C/C++ in 21 days”, read the first few chapters, and was ready to write my very first CGI script, or “web application” as you’d call it today.


First things first, I re-invented and re-implemented the request parsing wheel. That took a few solid weeks to do, and another couple to debug until it became stable. And then I wrote a simple, file-based dynamic content management system: HTML files were created from the “create article” form contents, saved to disk with a date-based file name, and the appropriate index.html files were updated to link to the new file. As I was wrapping things up, my boss delivered some great news: the client now wanted a search feature for the site.


Back then, I hadn’t even heard the word “database” before, let alone known how to use one. So, I needed to get clever. I figured a site search should return all articles that matched a particular word, and it should also allow for wildcard-like searching. Obviously, opening and searching each file would be incredibly slow (especially with wildcards), so I needed to think of something else.


Then it hit me: doing an ls with wildcards was basically instantaneous, so if I could just use that, my search would be instantaneous as well. The algorithm I came up with was pretty simple.



  • When an article is uploaded, split it into individual words.

  • For each word, check if a file exists on disk with that name, and add it if not.

  • Append the article’s file name to that word’s file.
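
For the curious, the three steps above can be sketched in a few lines. This is a reconstruction in Python, not the original C/C++ CGI code (which isn't shown here). The names index_article and index_dir, and the choice to treat lowercase alphanumeric runs as "words", are assumptions made purely for illustration.

```python
import re
from pathlib import Path


def index_article(index_dir: Path, article_name: str, text: str) -> None:
    """Record article_name in one file per distinct word found in text."""
    index_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: a "word" is a run of lowercase letters or digits,
    # which also keeps every word usable as a plain file name.
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    for word in sorted(words):
        # Append mode creates the word file if it does not exist yet,
        # covering the "check and add" step and the "append" step at once.
        with open(index_dir / word, "a", encoding="utf-8") as word_file:
            word_file.write(article_name + "\n")
```

Calling index_article(Path("index"), "1999-01-02.html", "Hello world") would leave two files, index/hello and index/world, each holding that one article name. Every distinct word on the site becomes its own file, which is where the trouble starts.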


As you might imagine, the search involved little more than an ls *search_word* and a few more file-opens to display the articles on the page. And it worked about as well as you might expect. I left that company long before I ever had to maintain my mess, but I still shiver when trying to imagine the WTF-moment of my successor.
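
The search side, again as a hedged Python sketch rather than the original code: fnmatch over a directory listing stands in for the shell's ls *search_word*, and the names search and index_dir are hypothetical. It assumes each word file holds one article file name per line.

```python
import fnmatch
import os


def search(index_dir, term: str) -> list:
    """Return the articles listed in every word file matching *term*."""
    # Wrap the term in wildcards, as `ls *search_word*` would.
    # Any wildcards the user typed pass straight through.
    pattern = "*" + term.lower() + "*"
    articles = set()
    for name in fnmatch.filter(os.listdir(index_dir), pattern):
        # One "file-open" per matching word file.
        with open(os.path.join(index_dir, name), encoding="utf-8") as word_file:
            articles.update(word_file.read().splitlines())
    return sorted(articles)
```

Note that the directory listing only looks instantaneous while the directory is small. With one file per distinct word, it would not stay small for long.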





