Posts in the 'Python/Python For Analytics' category (Page 2)

Python/Python For Analytics 33

1. Drawing a basic pie chart import matplotlib.pyplot as plt plt.pie([10,20,30,40,50]) # the values are summed and each wedge is sized in proportion to its share plt.show() 2. Adding labels plt.axis('equal') label = ['A','B','C','D','E'] plt.pie([10,20,30,40,50], labels=label) # set the labels; the values and labels must have the same length plt.legend() plt.show() 3. Showing percentages plt.axis('equal') # equal aspect ratio so the pie is drawn as a circle label = ['A','B','C','D','E'] plt.pie([10,20,30,40,50], labels=label, autopct='%.1f%..

Python/Python For Analytics 2020.07.09
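
The percentage step in the pie chart excerpt above is cut off, so the following is only a minimal sketch of how it might look. It assumes the format string '%.1f%%' (one decimal place followed by a literal percent sign) and uses the non-interactive "Agg" backend so that it runs without a display; both choices are assumptions, not taken from the original post.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt

values = [10, 20, 30, 40, 50]
labels = ['A', 'B', 'C', 'D', 'E']

fig, ax = plt.subplots()
ax.axis('equal')  # equal aspect ratio keeps the pie circular

# With autopct set, pie() also returns the text objects holding the percentages.
wedges, texts, autotexts = ax.pie(values, labels=labels, autopct='%.1f%%')

# Collect the rendered percentage strings, one per wedge.
percent_labels = [t.get_text() for t in autotexts]
print(percent_labels)
```

Each wedge's share is its value divided by the sum of all values (150 here), so the labels always add up to roughly 100%.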

[Python] matplotlib - line graph

1. Drawing a basic line graph import matplotlib.pyplot as plt import random y = [] for _ in range(10): y.append(random.randint(1,100)) plt.plot(y) # a single sequence is used as the y values by default plt.show() 2. Specifying the x-axis range plt.plot(range(10),y) # use range(10) for the x-axis plt.show() * If x and y differ in length, an error is raised: "ValueError: x and y must have same first dimension, but have shapes ( ) and ( )" 3. Adding a title and labels import matplotlib.pyplot as plt import random y1..

Python/Python For Analytics 2020.07.08
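
As a rough illustration of the line graph steps above, the sketch below plots ten random values, adds a title and axis labels, and then deliberately triggers the length-mismatch error. The title and label strings, the fixed random seed, and the "Agg" backend are assumptions made for this example; the original post's version of step 3 is truncated.

```python
import random
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt

random.seed(0)  # fixed seed only so repeated runs draw the same values
y = [random.randint(1, 100) for _ in range(10)]

fig, ax = plt.subplots()
ax.plot(range(len(y)), y)      # x and y have the same length
ax.set_title('Random values')  # hypothetical title
ax.set_xlabel('index')         # hypothetical x label
ax.set_ylabel('value')         # hypothetical y label

# Passing 5 x values for 10 y values raises ValueError.
mismatch_error = None
try:
    ax.plot(range(5), y)
except ValueError as exc:
    mismatch_error = str(exc)
print(mismatch_error)
```

Deriving the x values from len(y) rather than hard-coding range(10) is one simple way to avoid the mismatch.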

[Python] Handling duplicate values in pandas (duplicated, drop_duplicates)

When analyzing data, you sometimes need to remove duplicate values in a particular column; pandas provides the duplicated and drop_duplicates methods for this. duplicated(['column'], keep='first' | 'last' | False) : checks whether ['column'] contains duplicates. drop_duplicates(['column'], keep='first' | 'last' | False) : drops the rows that are duplicated in ['column']. Example) A DataFrame of daily price changes per product product = [['2020-01-01','T10001', 20000, 'BLACK'], ['2020-01-01','S10001', 10000, 'WHITE'], ['2020-01-01',..

Python/Python For Analytics 2020.04.08
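
The difference between the two methods described above can be sketched on a tiny table. The rows and the column names ('date', 'code', 'price') are hypothetical stand-ins, since the original example data is truncated.

```python
import pandas as pd

# Hypothetical price history: product T10001 appears on two dates.
rows = [
    ['2020-01-01', 'T10001', 20000],
    ['2020-01-02', 'T10001', 21000],
    ['2020-01-01', 'S10001', 10000],
]
df = pd.DataFrame(rows, columns=['date', 'code', 'price'])

# duplicated() only reports: True marks rows whose 'code' was already seen.
dup_mask = df.duplicated(subset=['code'], keep='first')

# drop_duplicates() removes rows: keep='last' retains the final row per code.
latest = df.drop_duplicates(subset=['code'], keep='last')

print(dup_mask.tolist())
print(latest)
```

Passing keep=False instead marks (or drops) every row that has a duplicate, including the first occurrence.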

[Python] Adding a country-identification column for client IPs to a pandas IIS-log DataFrame

import geoip2.database import pandas as pd reader = geoip2.database.Reader('GeoLite2-City.mmdb') log_field = ['date', 'time', 's-sitename', 's-computername' , 's-ip' , 'cs-method' , 'cs-uri-stem', 'cs-uri-query', 's-port' ,'cs-username', 'c-ip', 'cs-version', 'cs-User-Agent', 'cs-Cookie', 'cs-Referer', 'cs-host', 'sc-status', 'sc-substatus', 'sc-win32-status', 'sc-bytes', 'cs-bytes', 'time-taken..

Python/Python For Analytics 2020.02.28
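
Because the IIS-log excerpt above stops before the lookup itself, the sketch below shows only the general shape of adding a country column. The helper add_country_column, the stub lookup, and the sample addresses are all hypothetical; a real lookup would need the GeoLite2 database file, which this example avoids so that it stays self-contained.

```python
import pandas as pd

def add_country_column(df, lookup, ip_column='c-ip'):
    """Return a copy of df with a 'country' column computed by lookup(ip)."""
    out = df.copy()
    out['country'] = out[ip_column].map(lookup)
    return out

# Stub standing in for a GeoIP database (documentation-range addresses only).
FAKE_DB = {'203.0.113.7': 'KR', '198.51.100.9': 'US'}

def stub_lookup(ip):
    # With geoip2, a lookup might instead wrap reader.city(ip).country.iso_code
    # and catch geoip2.errors.AddressNotFoundError for unknown addresses.
    return FAKE_DB.get(ip, 'unknown')

log_df = pd.DataFrame({'c-ip': ['203.0.113.7', '198.51.100.9', '192.0.2.1']})
result = add_country_column(log_df, stub_lookup)
print(result)
```

Keeping the lookup as a parameter makes it easy to swap the stub for a database-backed function later.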

[Python] Splitting a pandas column using split

import pandas as pd lst_A = ['banana 3','apple 5','orange 6','mango 7'] df = pd.DataFrame(lst_A, columns=['text']) df df['fruit'] = df.text.str.split(' ').str[0] df['count'] = df.text.str.split(' ').str[1] df

Python/Python For Analytics 2020.02.25
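
One possible variation on the split example above is to split once with expand=True and convert the count to an integer, since str.split always yields strings. This is an alternative sketch, not the original post's code.

```python
import pandas as pd

df = pd.DataFrame(['banana 3', 'apple 5', 'orange 6', 'mango 7'],
                  columns=['text'])

# expand=True returns a DataFrame with one column per piece;
# n=1 splits on the first space only.
parts = df['text'].str.split(' ', n=1, expand=True)

df['fruit'] = parts[0]
df['count'] = parts[1].astype(int)  # convert the string digits to integers

print(df)
```

Splitting once avoids calling str.split twice on the same column.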

[Python] Replacing NaN values in a pandas DataFrame

fillna() replaces only "NaN" values, so it is the right tool when NaN is the only thing you need to handle. Replacing "NaN" with 0 (zero) using fillna() import pandas as pd import numpy as np list_A = [1, 2, 3, 4, np.nan, 6, 0 ] df = pd.DataFrame(list_A, columns=['value']) print(df['value']) df['value'] = df['value'].fillna(0) print(df['value']) ---------------------------------------------- 0 1.0 1 2.0 2 3.0 3 4.0 4 NaN 5 6.0 6 0.0 Name: value, dtyp..

Python/Python For Analytics 2020.02.16
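
To illustrate the point above that fillna() touches only missing values, here is a small sketch with a second, hypothetical 'name' column. It assumes a per-column dictionary of fill values, which is one of several ways to call fillna().

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'value': [1, 2, np.nan, 0],
    'name': ['a', None, '', 'd'],  # None is missing; '' is not
})

# A dict maps each column to its own replacement value.
filled = df.fillna({'value': 0, 'name': 'missing'})

print(filled)
```

The existing 0 and the empty string are left as they are, because neither counts as a missing value.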

[Python] pandas DataFrame URL decode

URL decoding in a pandas DataFrame from urllib.parse import unquote import pandas as pd example = ['%EC%95%88%EB%85%95%ED%95%98%EC%84%B8%EC%9A%94', '%EC%95%84%EB%A6%84%EB%8B%B5%EB%84%A4%EC%9A%94', '%ED%8C%8C%EC%9D%B4%EC%8D%AC'] df = pd.DataFrame(example, columns=['url']) # URL Decode df['url'] = df.url.apply(lambda x : unquote(x)) print(df) ------------------------------------------------------------..

Python/Python For Analytics 2020.02.16
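
Real log columns often contain missing values, which unquote() does not accept. The sketch below is one way to skip them, assuming Series.map with na_action='ignore'; the 'decoded' column name and the sample rows are hypothetical.

```python
from urllib.parse import quote, unquote
import pandas as pd

df = pd.DataFrame({'url': ['%ED%8C%8C%EC%9D%B4%EC%8D%AC', None, 'plain-text']})

# na_action='ignore' passes missing values through without calling unquote.
df['decoded'] = df['url'].map(unquote, na_action='ignore')

print(df)

# quote() is the inverse operation for UTF-8 text.
round_trip = quote(df['decoded'][0])
```

Strings without percent-escapes, such as 'plain-text', come back unchanged.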

[Python] Comparing a column against various data types using numpy.where

numpy.where(condition, value if True, value if False) Sample DataSet import pandas as pd import numpy as np lst_A = {'제품':['milk','juice','bread','icecream'], '수량':[3,5,10,2], '제조일시':['2020-01-01 01:00:00','2019-12-20 15:01:00','2019-12-31 00:00:00','2020-01-02 02:03:01']} df = pd.DataFrame(lst_A) df['제조일시'] = pd.to_datetime(df['제조일시']) 1. Using '2020-01-01 00:00:00' as the cutoff, create a new "유통기간" (shelf life) column and mark True as "유효" (valid) and False as "만료" (expired)..

Python/Python For Analytics 2020.01.21
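
The first numpy.where task above is truncated before its code, so the following is a sketch under assumptions: rows manufactured at or after the cutoff are treated as "유효" (valid), and a second, hypothetical 'stock' column with a threshold of 5 shows the same pattern on an integer column.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    '제품': ['milk', 'juice', 'bread', 'icecream'],   # product
    '수량': [3, 5, 10, 2],                             # quantity
    '제조일시': pd.to_datetime([                        # manufactured at
        '2020-01-01 01:00:00', '2019-12-20 15:01:00',
        '2019-12-31 00:00:00', '2020-01-02 02:03:01']),
})

# Datetime comparison (assumed direction: on/after the cutoff is valid).
cutoff = pd.Timestamp('2020-01-01 00:00:00')
df['유통기간'] = np.where(df['제조일시'] >= cutoff, '유효', '만료')

# Integer comparison on a hypothetical threshold.
df['stock'] = np.where(df['수량'] >= 5, 'enough', 'low')

print(df)
```

The condition is evaluated element-wise, so the result has one entry per row regardless of the column's data type.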

[Python] Converting a pandas DataFrame to a list

Creating a pandas DataFrame from list data import pandas as pd lst_A = ['a','b','c','d', 'e', 1, 2] df = pd.DataFrame(lst_A) print(df) Converting to a list (wrapped here in a NumPy array) import numpy as np np.array(df[0].tolist()) ----------------------------- array(['a', 'b', 'c', 'd', 'e', '1', '2'], dtype='

Python/Python For Analytics 2019.10.30
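
A short sketch of the conversion above, separating the plain list from the NumPy array: tolist() keeps the original Python types, while wrapping a mixed list in np.array() coerces every element to a string. The variable names are chosen for this example only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(['a', 'b', 'c', 'd', 'e', 1, 2])

# One column as a flat list; the integers stay integers.
as_list = df[0].tolist()

# The whole frame as a list of rows (one inner list per row).
rows = df.values.tolist()

# A NumPy array of mixed str/int values becomes an all-string array.
as_array = np.array(as_list)

print(as_list)
print(as_array)
```

If the original types matter, the plain list from tolist() is usually the safer result to keep.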

[Python] Comparing two text files using numpy setdiff1d (set difference)

numpy.setdiff1d(array1, array2) : the set difference of two arrays (the values in array1 that are not in array2) A_file.txt: Tomatoes are red / Bananas are yellow / Strawberries are red / Oranges are orange / Blackberries are black B_file.txt: Tomatoes are red / Bananas are yellow / Blackberries are black import pandas as pd import numpy as np df_A = pd.read_csv('A_file.txt', names=['data_A']) df_B = pd.read_csv('B_file.txt', names=['data_B']) list_A = np.array(df_A['data..

Python/Python For Analytics 2019.10.30
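
The file comparison above can be sketched without touching the file system by holding the lines in two lists. This is an in-memory stand-in for A_file.txt and B_file.txt, not the original post's file-reading code.

```python
import numpy as np

# Lines standing in for A_file.txt and B_file.txt.
lines_a = [
    'Tomatoes are red',
    'Bananas are yellow',
    'Strawberries are red',
    'Oranges are orange',
    'Blackberries are black',
]
lines_b = [
    'Tomatoes are red',
    'Bananas are yellow',
    'Blackberries are black',
]

# Values present in lines_a but absent from lines_b, sorted and de-duplicated.
only_in_a = np.setdiff1d(lines_a, lines_b)

print(only_in_a)
```

The result is order-sensitive in its arguments: swapping them would instead return the lines found only in the second file.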


