
[Python] Pandas: Adding a Client-IP Country Column to an IIS Log DataFrame

Pydole 2020. 2. 28. 19:26

 

import geoip2.database
import pandas as pd


reader = geoip2.database.Reader('GeoLite2-City.mmdb')

log_field = ['date', 'time', 's-sitename', 's-computername' , 's-ip' , 'cs-method' , 'cs-uri-stem',
             'cs-uri-query', 's-port' ,'cs-username', 'c-ip', 'cs-version', 'cs-User-Agent', 'cs-Cookie',
             'cs-Referer', 'cs-host', 'sc-status', 'sc-substatus', 'sc-win32-status', 'sc-bytes',
             'cs-bytes', 'time-taken']

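logfile = 'iis.log'  # placeholder path (not from the original post); point this at your IIS W3C log file

df = pd.read_csv(logfile, sep=' ', comment='#', engine='python', names=log_field, encoding='utf-8')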



# Remove '-' from the column names (e.g. 'c-ip' becomes 'cip')

df.columns = df.columns.str.replace('-','')



# Look up each client IP (cip) and store its country ISO code in a new 'nation' column


df['nation'] = df.cip.apply(lambda x : reader.city(x).country.iso_code)
print(df['nation'])


------------------------------------------------------------------------------------


0       KR
1       KR
2       KR
3       KR
4       KR
        ..
2423    KR
2424    KR
2425    KR
2426    KR
2427    KR
Name: nation, Length: 2428, dtype: object
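
One caveat with the per-row apply above: geoip2 raises geoip2.errors.AddressNotFoundError when an address is not in the database (private ranges such as 10.x.x.x are a common case), so a single such row stops the whole apply. The sketch below is one possible way around that, not part of the original post. It resolves each distinct IP once and leaves failed lookups as missing values. The helper add_country_column and the stand-in fake_lookup are hypothetical names introduced only for illustration; with a real database you would pass lambda ip: reader.city(ip).country.iso_code instead.

```python
import pandas as pd


def add_country_column(df, lookup, ip_col="cip", out_col="nation"):
    """Return a copy of df with a country-code column derived from ip_col.

    lookup is any callable mapping an IP string to an ISO country code.
    Each distinct IP is resolved once; failed lookups are left as missing.
    """
    codes = {}
    for ip in df[ip_col].dropna().unique():
        try:
            codes[ip] = lookup(ip)
        except Exception:
            # e.g. geoip2.errors.AddressNotFoundError for private or
            # unlisted addresses, ValueError for malformed input
            codes[ip] = None
    out = df.copy()
    out[out_col] = out[ip_col].map(codes)
    return out


# Hypothetical stand-in lookup so the sketch runs without a GeoLite2 file.
# With the reader from the post it would be:
#   lookup = lambda ip: reader.city(ip).country.iso_code
def fake_lookup(ip):
    table = {"203.0.113.7": "KR", "198.51.100.9": "US"}
    return table[ip]  # raises KeyError for unknown IPs


sample = pd.DataFrame(
    {"cip": ["203.0.113.7", "10.0.0.1", "198.51.100.9", "203.0.113.7"]}
)
result = add_country_column(sample, fake_lookup)
print(result)
```

Resolving only the unique addresses also tends to help on real access logs, where the same client IP usually appears on many rows.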