
On using unique IPs as a measure of the number of readers

(Not that I'd expect anyone (except me, of course) to do such a silly thing, but anyhow.)

:~/logs$ grep python access.log | cut -d ' ' -f 1 | sort | uniq | wc -l
 1850
 

:~/logs$ grep python access.log | cut -d ' ' -f 1 | sort | uniq | grep inktomisearch | wc -l
 820
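
The two counts above can be folded into one small helper that also answers the obvious follow-up question: how many hosts are left once the crawler is filtered out. This is only a sketch. It assumes the client host is the first space-separated field of each log line, and the function name count_unique_hosts is made up for illustration.

```shell
# Sketch: count unique client hosts among log lines matching a pattern,
# optionally leaving out hosts that match a second pattern (e.g. a crawler).
# Assumption: the client host is the first space-separated field.
count_unique_hosts() {
    # $1 = log file, $2 = pattern selecting requests,
    # $3 = optional host pattern to exclude
    if [ -n "${3:-}" ]; then
        grep -e "$2" "$1" | cut -d ' ' -f 1 | sort -u | grep -v -e "$3" | wc -l
    else
        grep -e "$2" "$1" | cut -d ' ' -f 1 | sort -u | wc -l
    fi
}
```

Calling it as `count_unique_hosts access.log python inktomisearch` would report the hosts that are not the crawler. With the numbers above, that would leave 1850 - 820 = 1030 hosts.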
 

For the record:

:~/logs$ tail -1 access.log | cut -d '[' -f 2 | cut -d ']' -f 1
12/Feb/2005:19:20:49 +0200

:~/logs$ head -1 access.log | cut -d '[' -f 2 | cut -d ']' -f 1
06/Feb/2005:06:52:37 +0200



Comments:

Posted by Jarno Virtanen (On behalf of David Mertz) at 14.02.2005, 07:35

(I, Jarno, am posting this on behalf of David Mertz, who couldn't post the comment himself because of some errors.)

--

That's exactly what I do as well. I can't really think of a better proxy for the number of readers than the number of IP addresses. I know it's not precise, but it errs in both directions: some (many) readers have changing IPs, but many readers also go through a common cache/proxy, so they look like just one address.

In fact, I've been watching this for a number of years. My host purges logs weekly, keeping only summaries, so I run a script to extract IPs on a regular basis (more often than weekly, but slightly stochastic, so I can miss a couple of IPs here and there).

You can see it at:

http://gnosis.cx/cgi-bin/bug2.cgi

I.e. 500k unique IP addresses, which seems pretty good to me. The script that does it is:

$ cat bug2.cgi
#!/bin/bash
cat /www/logs/gnosis-access-log all_visitors | awk '{print $1}' |
sort | uniq > new_all
mv new_all all_visitors
echo "Content-type: text/plain"
echo
echo "Total: " `wc all_visitors | cut -b -7` "(since Nov 21, 2002)"
echo "===================================="
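
The merge step in the script above (combine the current log with the running list, keep one line per address, replace the list) can be sketched as a small function. This is a hedged illustration rather than the script actually running on the site: the name merge_visitors and the use of mktemp and sort -u are assumptions, and the file names are passed in as arguments instead of being hard-coded.

```shell
# Sketch: fold the first field of an access log into a cumulative
# list of unique visitors, then report the running total.
# merge_visitors is a hypothetical name; both paths are arguments.
merge_visitors() {
    # $1 = access log, $2 = cumulative visitors file (created if missing)
    touch "$2"
    tmp=$(mktemp) || return 1
    # The cumulative file holds one address per line, so awk's $1
    # is the whole line there and the first field in the log.
    cat "$1" "$2" | awk '{print $1}' | sort -u > "$tmp" && mv "$tmp" "$2"
    # Reading from stdin makes wc print only the count, no file name.
    echo "Total: $(wc -l < "$2" | tr -d ' ')"
}
```

Writing to a temporary file and renaming it afterwards mirrors the new_all / all_visitors step in the original, so the running list is never read and overwritten at the same time.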