

📖 Chapter Introduction

So far we have found the best model and pushed its performance as far as it would go. An R² score of 0.8824! That is an excellent result. But this single number does not tell us everything about the model. On which data does the model make mistakes, and which features does it treat as important? In this session we listen to the real story behind the number. Using two powerful visualization tools, plot_model and evaluate_model, we will become "model detectives" and look deep inside the model.


🎯 Chapter Goals


💻 Full Code and Project Structure for This Chapter

Core Code for This Chapter

💡 Using the tuned_gbr_model completed in Lecture 7, we visually analyze the model's performance and characteristics.

# 1. Prepare the libraries and the tuned model (covered in Lectures 1-7)
from pycaret.datasets import get_data
from pycaret.regression import setup, create_model, tune_model, plot_model, evaluate_model

# Load the data and set up the experiment
df = get_data('insurance')
setup(data=df, target='charges', session_id=123, fold_shuffle=True)

# Create and tune the model
base_model = create_model('gbr')
tuned_gbr_model = tune_model(base_model, optimize='R2', n_iter=200)

# 2. Generate individual plots (for reports or saving)
# Residuals plot
plot_model(tuned_gbr_model, plot = 'residuals')

# Feature importance plot
plot_model(tuned_gbr_model, plot = 'feature')

# 3. Launch the interactive dashboard (for exploration)
# The UI only appears when this runs in a Jupyter Notebook environment.
evaluate_model(tuned_gbr_model)

Preview of the Code Output

Output of plot_model(plot='residuals')

[Figure: screenshot of the residuals plot for tuned_gbr_model]
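To make the quantity behind a residuals plot concrete, here is a minimal sketch in plain Python. The charge values are made up for illustration and are not taken from the insurance dataset or from the tuned model. A residual is computed here as actual minus predicted, the usual statistical convention; a plotting library may use the opposite sign, so check the axis label before interpreting the direction of an error.

```python
# Hypothetical actual and predicted charges (illustrative values only).
actual = [12000.0, 3000.0, 45000.0, 8000.0]
predicted = [11000.0, 3500.0, 30000.0, 8200.0]

# Residual = actual - predicted. Positive means the model under-predicted.
residuals = [a - p for a, p in zip(actual, predicted)]

# The row with the largest absolute residual is where the model is most wrong.
worst_index = max(range(len(residuals)), key=lambda i: abs(residuals[i]))

# A mean residual far from zero hints at systematic over- or under-prediction.
mean_residual = sum(residuals) / len(residuals)

print(residuals)
print(worst_index, mean_residual)
```

In a residuals plot, each of these values becomes one point plotted against its predicted value. A well-behaved model shows points scattered evenly around zero, while a cluster of large residuals (such as the third row here) marks the kind of data the model struggles with.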

Output of plot_model(plot='feature')

[Figure: screenshot of the feature importance plot for tuned_gbr_model]
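One way to build intuition for what "feature importance" means is permutation importance: shuffle one feature's values and measure how much the prediction error grows. The sketch below is an assumption-laden illustration, not the method behind the plot above. For tree-based models the plot most likely draws on the estimator's own built-in importance scores, which is a different technique. The `toy_predict` function, the `region_code` feature, and all data values are hypothetical stand-ins.

```python
import random

# Hypothetical stand-in for a trained model: uses age and smoker, ignores region_code.
def toy_predict(row):
    return 250.0 * row["age"] + 20000.0 * row["smoker"]

def mse(rows, targets, predict):
    """Mean squared error of predict() over the given rows."""
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(rows, targets, predict, feature, repeats=20, seed=0):
    """Average increase in MSE after shuffling one feature's column."""
    rng = random.Random(seed)
    baseline = mse(rows, targets, predict)
    increases = []
    for _ in range(repeats):
        shuffled = [r[feature] for r in rows]
        rng.shuffle(shuffled)
        permuted = [dict(r, **{feature: v}) for r, v in zip(rows, shuffled)]
        increases.append(mse(permuted, targets, predict) - baseline)
    return sum(increases) / repeats

# Made-up rows (illustrative values only).
rows = [
    {"age": 19, "smoker": 1, "region_code": 0},
    {"age": 28, "smoker": 0, "region_code": 1},
    {"age": 33, "smoker": 0, "region_code": 2},
    {"age": 41, "smoker": 1, "region_code": 3},
    {"age": 47, "smoker": 0, "region_code": 0},
    {"age": 52, "smoker": 0, "region_code": 1},
    {"age": 58, "smoker": 1, "region_code": 2},
    {"age": 64, "smoker": 0, "region_code": 3},
]
# Targets come from the toy model itself, so the baseline error is exactly 0.
targets = [toy_predict(r) for r in rows]

importance = {
    name: permutation_importance(rows, targets, toy_predict, name)
    for name in ("age", "smoker", "region_code")
}
ranking = sorted(importance, key=importance.get, reverse=True)
print(ranking)
```

Shuffling `region_code` changes nothing because the toy model never reads it, so its importance is zero. Shuffling `age` or `smoker` breaks the link between the feature and the target and the error rises. A feature importance plot conveys the same kind of ranking as a bar chart.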