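import geoip2.database
import pandas as pd

# Open the GeoLite2 City database (the .mmdb file must already be on disk)
reader = geoip2.database.Reader('GeoLite2-City.mmdb')

# Field names of the W3C extended (IIS-style) log, in file order
log_field = ['date', 'time', 's-sitename', 's-computername', 's-ip', 'cs-method', 'cs-uri-stem',
             'cs-uri-query', 's-port', 'cs-username', 'c-ip', 'cs-version', 'cs-User-Agent', 'cs-Cookie',
             'cs-Referer', 'cs-host', 'sc-status', 'sc-substatus', 'sc-win32-status', 'sc-bytes',
             'cs-bytes', 'time-taken']

# logfile must hold the path to the log file to analyze.
# Lines starting with '#' are header directives, so they are skipped as comments.
df = pd.read_csv(logfile, sep=' ', comment='#', engine='python', names=log_field, encoding='utf-8')

# Remove '-' from the column names so they can be accessed as attributes (c-ip -> cip)
df.columns = df.columns.str.replace('-', '')

# Look up the country ISO code for each client IP
df['nation'] = df.cip.apply(lambda x: reader.city(x).country.iso_code)
print(df['nation'])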
------------------------------------------------------------------------------------
0 KR
1 KR
2 KR
3 KR
4 KR
..
2423 KR
2424 KR
2425 KR
2426 KR
2427 KR
Name: nation, Length: 2428, dtype: object
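One caveat with the lookup above: `reader.city()` raises `geoip2.errors.AddressNotFoundError` for addresses that are not in the database (private or loopback addresses, for example), and a single such row would stop the whole `apply`. Below is a minimal sketch of a guard around the lookup. The helper name `safe_country`, its `lookup` and `not_found` parameters, and the `fake_db` stand-in are hypothetical names introduced here for illustration. The stand-in lets the snippet run without an `.mmdb` file.

```python
import ipaddress


def safe_country(ip, lookup, not_found=(KeyError,)):
    """Return lookup(ip), or None if ip is invalid, non-public, or unknown.

    lookup    -- any callable that maps an IP string to a country code
    not_found -- exception types that mean "address not in the database"
    """
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return None  # not an IP address at all, e.g. '-' in a log field
    if not addr.is_global:
        return None  # private, loopback, link-local, etc.
    try:
        return lookup(ip)
    except not_found:
        return None  # public address, but no record for it


# Stand-in for the GeoLite2 reader (assumption, for demonstration only)
fake_db = {'8.8.8.8': 'US'}


def fake_lookup(ip):
    return fake_db[ip]  # raises KeyError for unknown addresses


for ip in ['8.8.8.8', '192.168.0.10', '-', '1.1.1.1']:
    print(ip, safe_country(ip, fake_lookup))

# With the real reader, the call would presumably look like this:
#   import geoip2.errors
#   df['nation'] = df.cip.apply(
#       lambda x: safe_country(
#           x,
#           lambda ip: reader.city(ip).country.iso_code,
#           not_found=(geoip2.errors.AddressNotFoundError,),
#       )
#   )
```

Rows that cannot be resolved end up as `None` in the `nation` column, so they can be filtered or filled afterwards instead of aborting the run.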