Monthly archive: May 2015

A Python log-statistics example[@]

@http://blog.aoath.com/archives/674.html
Source file:

file name: abc

218.79.251.215 - - [23/May/2006:08:57:44 +0800] "GET /fg172.exe HTTP/1.1" 206 2350253
220.178.150.3 - - [23/May/2006:08:57:40 +0800] "GET /fg172.exe HTTP/1.1" 200 2350253
59.42.2.185 - - [23/May/2006:08:57:52 +0800] "GET /fg172.exe HTTP/1.1" 200 2350253
219.140.190.130 - - [23/May/2006:08:57:59 +0800] "GET /fg172.exe HTTP/1.1" 200 2350253
221.228.143.52 - - [23/May/2006:08:58:08 +0800] "GET /fg172.exe HTTP/1.1" 206 719996
221.228.143.52 - - [23/May/2006:08:58:08 +0800] "GET /fg172.exe HTTP/1.1" 206 713242
221.228.143.52 - - [23/May/2006:08:58:09 +0800] "GET /fg172.exe HTTP/1.1" 206 1200250

Example:

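#!/usr/bin/env python
# Count the distinct client IPs in the log file 'abc'.

unique_ips = []   # a list, so each membership test below is a linear scan
with open('abc', 'r') as log_file:
    for line in log_file:
        ip = line.split()[0]        # the client IP is the first field
        if ip not in unique_ips:
            unique_ips.append(ip)
print(len(unique_ips))

# End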

Output:

5

## Deduplicates the IPs and counts how many distinct IPs there are.
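The deduplicate-and-count step can also be written with a set, which is the built-in type meant for this job. The following is only a sketch, not the original author's code: the function name `count_unique_ips` and the `SAMPLE` constant are made up for illustration, and the sample text is simply the seven lines of the source file above.

```python
def count_unique_ips(lines):
    """Return the number of distinct client IPs (first field of each line)."""
    unique = set()
    for line in lines:
        fields = line.split()
        if fields:                  # skip blank lines instead of raising IndexError
            unique.add(fields[0])   # adding an IP that is already present is a no-op
    return len(unique)


# The seven sample lines from the source file 'abc' shown above.
SAMPLE = """\
218.79.251.215 - - [23/May/2006:08:57:44 +0800] "GET /fg172.exe HTTP/1.1" 206 2350253
220.178.150.3 - - [23/May/2006:08:57:40 +0800] "GET /fg172.exe HTTP/1.1" 200 2350253
59.42.2.185 - - [23/May/2006:08:57:52 +0800] "GET /fg172.exe HTTP/1.1" 200 2350253
219.140.190.130 - - [23/May/2006:08:57:59 +0800] "GET /fg172.exe HTTP/1.1" 200 2350253
221.228.143.52 - - [23/May/2006:08:58:08 +0800] "GET /fg172.exe HTTP/1.1" 206 719996
221.228.143.52 - - [23/May/2006:08:58:08 +0800] "GET /fg172.exe HTTP/1.1" 206 713242
221.228.143.52 - - [23/May/2006:08:58:09 +0800] "GET /fg172.exe HTTP/1.1" 206 1200250
"""

print(count_unique_ips(SAMPLE.splitlines()))  # 5
```

To run it against a real file, an open file object can be passed directly, for example `count_unique_ips(open('abc'))`, since iterating a file yields its lines.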

Example 2 (optimized version):

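#!/usr/bin/env python
# Tally hits per client IP in 'access_log', then print the number of distinct IPs.

hits = {}
with open('access_log', 'r') as log_file:
    for line in log_file:
        ip = line.split()[0]        # the client IP is the first field
        try:
            hits[ip] += 1
        except KeyError:            # first time this IP is seen
            hits[ip] = 1
print(len(hits))

# End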

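## As much as 7 times faster than the first example! Testing membership in a list scans the list element by element, while a dict lookup is a hash lookup.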
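To check the speed difference on your own machine, a rough timing harness along the following lines could be used. It is a sketch under stated assumptions: the function names and `make_synthetic_log` are hypothetical, the log lines are synthetic rather than the author's data, and the ratio it prints depends heavily on how many distinct IPs the log contains, so it will not necessarily reproduce the 7x figure.

```python
import random
import timeit


def count_with_list(lines):
    """Distinct-IP count using a list, as in the first example."""
    seen = []
    for line in lines:
        ip = line.split()[0]
        if ip not in seen:          # linear scan of the list on every line
            seen.append(ip)
    return len(seen)


def count_with_dict(lines):
    """Distinct-IP count using a dict of hit counts, as in Example 2."""
    hits = {}
    for line in lines:
        ip = line.split()[0]
        hits[ip] = hits.get(ip, 0) + 1   # hash lookup, O(1) on average
    return len(hits)


def make_synthetic_log(n_lines=10000, n_ips=1000, seed=0):
    """Build fake log lines (hypothetical data, not the author's log)."""
    rng = random.Random(seed)
    ips = ["10.0.%d.%d" % (n // 256, n % 256) for n in range(n_ips)]
    tail = ' - - [23/May/2006:08:57:44 +0800] "GET /fg172.exe HTTP/1.1" 200 2350253'
    return [rng.choice(ips) + tail for _ in range(n_lines)]


if __name__ == "__main__":
    lines = make_synthetic_log()
    t_list = timeit.timeit(lambda: count_with_list(lines), number=3)
    t_dict = timeit.timeit(lambda: count_with_dict(lines), number=3)
    print("list: %.3fs  dict: %.3fs  ratio: %.1fx" % (t_list, t_dict, t_list / t_dict))
```

The gap widens as the number of distinct IPs grows, because the list version does more comparisons per line while the dict version does roughly constant work per line.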