This article gives a rough introduction to the semantic web and microformats, and shows how to extend a site's functionality using Greasemonkey, a Firefox browser extension, together with a custom program called a user script.
The limited Web
First generation web sites tend to be pretty closed by nature. They display information in one particular way, seldom let you access the underlying data in an easy-to-parse format (such as XML), and rarely let you interact with the data through APIs. The functionality is limited to what the designers of the site thought of. That sometimes becomes frustrating.
Sending emails to sites with proposed improvements rarely gets you an answer, let alone a change that matches your request.
Let’s take an example
Finn.no is (today) such a first generation web site. Let’s look at its real estate section. As finn.no is a prominent medium for publishing real estate listings in Norway, it contains a lot of entries, which is good for the purpose of this article (and for the users of the site). When you load a page, e.g. this one, you get a two-column page with formatted information about the apartment or house for sale, including the address and the contact information. The page also sometimes lists the date and time of the various showings, called visning in Norwegian.

If you are interested in the property, you might want to add the showings to your calendar. If you are like me, your calendar is on the web, and is just one tab away. But entering the information by hand is error prone, time consuming and not necessarily consistent from one time to the next. You might forget some data, or make a mistake. There has to be a better way.
User scripts
Not all is lost! As the information is published on the web, if you have the will, you can find a way to interact with that data and extend the original functionality of the web site.
One of the easiest and most user-friendly ways is to add an extension to your browser that lets you run user scripts. User scripts are programs written in JavaScript, the language supported by most browsers, which lets you dynamically modify the DOM, the in-memory representation of a web page. When such a browser extension runs a user script, the script gets a chance to interact with the page, modify it, and even contact other pages to retrieve or combine information.
On Firefox, an increasingly popular web browser for which thousands of extensions exist, there are at least two extensions that allow you to use and manage user scripts: Operator and Greasemonkey. This article makes use of the latter and assumes you have it installed in your browser.
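To make this concrete, here is a minimal sketch of what a user script can look like. The comment header follows Greasemonkey's metadata block convention. The script name, the namespace, the `@include` pattern and the `markTitle` function are all hypothetical and only serve as illustration; this is not the script used later in this article.

```javascript
// ==UserScript==
// @name        Title marker (hypothetical example)
// @namespace   http://example.org/userscripts
// @include     http://www.finn.no/*
// ==/UserScript==

// Prefix the page title so you can see that the script has run.
// The document is passed in as a parameter so the function can also be
// exercised outside the browser with a stand-in object.
function markTitle(doc) {
  doc.title = '[user script] ' + doc.title;
  return doc.title;
}

// Inside the browser a global `document` exists; elsewhere this is skipped.
if (typeof document !== 'undefined') {
  markTitle(document);
}
```

The `@include` line is what restricts the script to pages whose address matches the pattern, so the script leaves every other site untouched.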
Microformats
If you know about the web, you may have heard about the semantic web. The W3C defines it as a common framework that allows data to be shared and reused across application, enterprise, and community boundaries. Microformats are part of this effort: they let you use markup to tag reusable information inside a web page.
One such microformat is hCalendar, which lets you embed an event in a page. User scripts already exist that detect hCalendar entries in a web page and let you publish them to your calendar. If you use Google Calendar, you can find a Greasemonkey user script that adds one or more buttons to the current page, so that a single click registers an hCalendar event in your Google Calendar.
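As a rough illustration of what hCalendar markup looks like, the sketch below builds an event fragment from a plain object. The class names `vevent`, `summary`, `dtstart`, `dtend` and `location` come from the hCalendar microformat; the `toHCalendar` and `escapeHtml` helpers, the shape of the event object and the sample values are assumptions made up for this example.

```javascript
// Escape characters that are special in HTML text and attribute values.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build an hCalendar fragment for one event.
// `event` is a hypothetical plain object { summary, start, end, location },
// where start and end are ISO 8601 strings that include a UTC offset.
function toHCalendar(event) {
  var start = escapeHtml(event.start);
  var end = escapeHtml(event.end);
  return [
    '<div class="vevent">',
    '  <span class="summary">' + escapeHtml(event.summary) + '</span>',
    '  <abbr class="dtstart" title="' + start + '">' + start + '</abbr>',
    '  <abbr class="dtend" title="' + end + '">' + end + '</abbr>',
    '  <span class="location">' + escapeHtml(event.location) + '</span>',
    '</div>'
  ].join('\n');
}

// Made-up sample data, not taken from a real listing.
var sampleShowing = {
  summary: 'Visning',
  start: '2008-03-09T13:00:00+01:00',
  end: '2008-03-09T14:00:00+01:00',
  location: 'Storgata 1, Oslo'
};
console.log(toHCalendar(sampleShowing));
```

A script that understands hCalendar only needs to look for elements with the `vevent` class and read the nested properties, which is what makes the markup reusable across sites.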
Finn.no realestate hcalendar events
Unfortunately, not many sites use microformats yet, and finn.no does not. But if I create a small user script that formats the showings as hCalendar events, I can then reuse the more generic script that publishes hCalendar events to Google Calendar.
So here we go. Provided that you have installed the Firefox Greasemonkey extension (I used 0.7), install the finn.no user script, then install the modified hCalendar publishing script. Then browse to finn.no/realestate.
After running the finn.no user script

After running the hCalendar publisher user script

Now just press the button corresponding to your event! Finn.no now integrates easily with Google Calendar, provided you use Greasemonkey and install the appropriate user scripts.
Bonus: as Google Maps now covers Norway, you can also place the address on the map (provided that Google recognizes the address published on the original site).
Notes
- I had to modify the original hCalendar publisher script because I had problems with time zones.
- There are modified/improved versions of the original hCalendar publisher script on the web, in particular from Patrick Chanezon of Google (and ossgtp) fame. I’ve chosen to use the original user script.
- I’ve also had encoding issues in Google Calendar's add event page. The problem may instead come from the button URL generated by the publishing script, but I’ve chosen to keep the differences from the original script minimal. Thus the generated hCalendar contains encoded data…
Feel free to report any issues.
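The time zone and encoding notes above can be illustrated with a small sketch. It is not the fix applied in the modified script; it only shows one way to sidestep both problems, by converting times to UTC and percent-encoding each text value. The `toUtcStamp` and `toGoogleCalendarUrl` helpers and the event object are hypothetical, and the address and parameter names (`action=TEMPLATE`, `text`, `dates`, `location`) are my understanding of Google Calendar's event template link, so check them against Google's documentation before relying on them.

```javascript
// Convert an ISO 8601 timestamp that carries a UTC offset into the compact
// UTC form YYYYMMDDTHHMMSSZ, so the result does not depend on the time zone
// of the machine running the script.
function toUtcStamp(isoString) {
  return new Date(isoString).toISOString()
    .replace(/[-:]/g, '')     // drop date and time separators
    .replace(/\.\d{3}/, '');  // drop the milliseconds
}

// Build an "add event" link. Every free-text value goes through
// encodeURIComponent so that characters such as å, æ, ø, spaces and colons
// survive the trip inside the URL.
// `event` is a hypothetical object { summary, start, end, location }.
function toGoogleCalendarUrl(event) {
  var params = [
    'action=TEMPLATE',
    'text=' + encodeURIComponent(event.summary),
    'dates=' + toUtcStamp(event.start) + '/' + toUtcStamp(event.end),
    'location=' + encodeURIComponent(event.location)
  ];
  // Assumed endpoint, see the note above the code.
  return 'https://www.google.com/calendar/event?' + params.join('&');
}
```

Encoding the values once, at the point where the URL is assembled, avoids having to store already-encoded data in the hCalendar markup itself.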
Conclusion
This article introduced an exciting part of the future of the web: the semantic web and web site integration.
The technique used is not perfect. In particular, it isn’t necessarily very stable: if the targeted site changes, your script might break. It is nevertheless a very efficient way to prototype a change to a site or the integration of one or more sites.
Note: as of this writing, I have no relation with the site used for this article, apart from being a more or less happy user (happier now that I have this script).