Short Link Bot

A few months ago I started using a service called bit.ly - a Short Link service. Such services are generally useful, but they become close to mandatory for Twitter users. (If you haven't tried Twitter yet you are missing something interesting! Strange bird, though: the maximum length of a micro-blog post is 140 characters!)

In principle a Short Link service should not be necessary for somebody who has access to an Apache server, as I do. It's just as easy to set up an .htaccess file and map short names to any URL as it is to use a Short Link service. In practice, though, the bit.ly service turned out to be easy to use and quite useful. So I started using it.
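
As a rough illustration of the .htaccess approach, here is a minimal Python sketch that turns a table of short names into Apache `Redirect` lines (the mod_alias directive). The function name `htaccess_redirects`, the short names and the target URLs are all made up for this example; the output would be pasted into an .htaccess file in the site's document root, assuming the server allows overrides there.

```python
def htaccess_redirects(mapping, status=302):
    """Return one mod_alias Redirect line per short name.

    mapping: dict of short name -> full target URL (hypothetical data).
    status:  HTTP redirect status; 302 keeps the redirect temporary.
    """
    lines = []
    for short_name, target_url in sorted(mapping.items()):
        # Keep short names simple so they are safe to use as a URL path.
        if not short_name.isalnum():
            raise ValueError(f"unsafe short name: {short_name!r}")
        lines.append(f"Redirect {status} /{short_name} {target_url}")
    return "\n".join(lines) + "\n"


# Hypothetical short names and targets, for illustration only.
links = {
    "net": "http://example.com/articles/introduction-to-networking",
    "bot": "http://example.com/articles/short-link-bot",
}
print(htaccess_redirects(links), end="")
```

Note that `Redirect` matches on a path prefix, so `/bot/anything` would be redirected as well; that is usually harmless for a handful of short names.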

Everything was fine until I noticed that bit.ly was reporting, for example, twenty clicks when my web server stats showed something like eighty views. So I was wondering: "What's up?"

There's one other issue that came up: I was using the bit.ly service too much! Within a few weeks I had accumulated over 100 links and was no longer able to make any sense of them. Bit.ly doesn't offer an XML download option and their user interface doesn't allow for any kind of association between a link and anything else.

So I wrote my own short link library. The initial version, 0.001, went into production on the 10th of July, 2009.

As part of the process my short link code tries to get the title of every page it references and, if possible, some additional information such as keywords. For this to work I must GET the target URL and scan for the <title> and other tags.

Before doing that I set the User-Agent header to a string that refers to this article.
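
The fetch-and-scan step could look something like the sketch below. It is not the actual library code: the `USER_AGENT` string, the example.com URL in it and the function names are assumptions made for illustration, and it uses only the Python standard library (`urllib.request` and `html.parser`).

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen

# Hypothetical User-Agent; the real string is not given in this article.
USER_AGENT = "ShortLinkBot/0.001 (+http://example.com/short-link-bot)"


class PageInfoParser(HTMLParser):
    """Collect the <title> text and the keywords <meta> tag, if present."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.keywords = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            if (attr.get("name") or "").lower() == "keywords":
                content = attr.get("content") or ""
                self.keywords = [k.strip() for k in content.split(",") if k.strip()]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def parse_page_info(html):
    """Return (title, keywords) found in an HTML string."""
    parser = PageInfoParser()
    parser.feed(html)
    return parser.title.strip(), parser.keywords


def fetch_page_info(url, timeout=10):
    """GET the target URL with the custom User-Agent, then scan the page."""
    request = Request(url, headers={"User-Agent": USER_AGENT})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return parse_page_info(html)
```

Keeping the parsing in `parse_page_info` separate from the network call makes it easy to try the scanner on a saved page before pointing it at live URLs.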

Getting back to my story:

As you can imagine my short link library is not designed to simply redirect from a short link to a longer one. It is intended to help me keep track of what I am doing and, as much as possible, report on the results of a given activity.

You'll recall that I was wondering why bit.ly was showing, for example, twenty clicks when my web server stats were showing as many as eighty. It turns out that the moment a Short Link appears in the Twitter timeline it gets picked up by many, many robots. These robots all execute a quick scan of the target page.

My guess is that bit.ly doesn't count bot hits but does politely redirect them - which results in inflated numbers on my web server. I also realize that humans who read my web pages probably refresh the page or click around a bit - which also inflates the number of views. (I'm mostly using Joomla; it's not designed to provide useful view stats.) Finally, my sites do have regular viewers who arrive without going through a short link, so a short link service cannot track them. This, again, increases the number of page views.
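
To make the difference between the two sets of numbers concrete, here is a small sketch that splits a list of hits into bot hits, human views and unique human visitors. The marker list is a crude heuristic and an assumption of this example (real robots do not always identify themselves), and the IP addresses and User-Agent strings are invented sample data.

```python
# Substrings that commonly show up in robot User-Agent strings.
# This list is a guess for illustration, not a complete catalogue.
BOT_MARKERS = ("bot", "crawler", "spider", "curl", "python-urllib")


def looks_like_bot(user_agent):
    """Treat an empty User-Agent, or one containing a marker, as a robot."""
    ua = user_agent.lower()
    return not ua or any(marker in ua for marker in BOT_MARKERS)


def summarize_hits(hits):
    """hits: iterable of (ip, user_agent) pairs logged for one page."""
    summary = {"bot_hits": 0, "human_views": 0, "unique_human_visitors": 0}
    human_ips = set()
    for ip, user_agent in hits:
        if looks_like_bot(user_agent):
            summary["bot_hits"] += 1
        else:
            summary["human_views"] += 1
            human_ips.add(ip)  # repeat views from one address count once
    summary["unique_human_visitors"] = len(human_ips)
    return summary


# Invented sample data: one reader refreshing, two robots, one more reader.
sample_hits = [
    ("192.0.2.1", "Mozilla/5.0 (Windows NT 6.1) Firefox/3.5"),
    ("192.0.2.1", "Mozilla/5.0 (Windows NT 6.1) Firefox/3.5"),
    ("192.0.2.2", "ExampleBot/1.0 (+http://example.com/bot)"),
    ("192.0.2.3", "Mozilla/5.0 (compatible; SomeCrawler/2.0)"),
    ("192.0.2.4", "Opera/9.64 (X11; Linux i686)"),
]
print(summarize_hits(sample_hits))
```

In this sample the web server would report five views, while a service that ignores robots and repeat views would see only two visitors.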

I have no idea what the bots are doing with the page text. In the case of Facebook I imagine they probably do a virus scan of some sort. Personally I plan to start scanning my list of followers in an attempt to identify spambots - so I guess the basic idea of reading a page which is referenced on Twitter can be useful.

If you are reading this because you followed the link in the User-Agent field in your web stats, I'd like to hear from you! Please take a moment to Contact Me if you have any questions, comments or concerns. I will be glad to help if there is anything I can do for you.

Thanks!

P.S. The code is rather simplistic and the user interface is not much use just now. It will be a while before this project is worth getting excited about.
