Keunyoung Yoon

Data Engineer

Data Scientist

Data Analyst

Hello, I’m

Data Engineer and Data Scientist

With over 9 years of experience in game data analytics, I have a strong background in Python, SQL, Apache Spark, and Airflow. I’ve led analytics for a game with over 2 million monthly active users and helped increase FIFA Online 4’s revenue by 10% year-over-year. My expertise in data-driven insights and collaboration with cross-functional teams has improved user retention by 12% and led to the development of 20+ real-time data dashboards, enabling faster decision-making across teams.

9 Years of Experience

Senior Data Analyst for a game with MAU of 2M

20+ Dashboards with Real-Time Analytics

Works

Professional Experience

Data Engineering

Implemented data pipelines with Apache Kafka and Spark to optimize real-time data flow. Managed large datasets efficiently and improved system response time by 20%, enabling faster decision-making. Proficient in Python, Airflow, and Docker for data processing and automation.

Data Analytics

Used Python and SQL to drive insights and decision-making. Increased user retention by 12% through in-depth user behavior analysis, and contributed to 10% revenue growth with targeted data-driven strategies. Expert in building real-time dashboards using Tableau.

Projects

Personal Projects

AI Chatbot Using Solar LLM

Developed a document AI chatbot using Solar LLM and Kafka for efficient real-time retrieval and generation. The chatbot uses RAG (Retrieval-Augmented Generation) to handle complex documents and provide accurate answers in real time.

Streamlining Data Processing

A Dockerized PySpark and Elasticsearch Pipeline for Real-time Data Visualization: Utilized financial API data to create an end-to-end processing pipeline with Docker, PySpark, Elasticsearch, and Kibana for real-time financial data visualization.

Analyzing U.S. COVID-19 Data

Utilized U.S. COVID-19 data (2020–2023) to create a comprehensive real-time visualization pipeline using Docker, PySpark, Elasticsearch, and Kibana, focusing on healthcare insights and trends. The project highlighted key pandemic aspects, such as mortality rates and underlying health conditions, offering insights for public health decisions.

MMORPG Data Preprocessing And Visualization

Used Python, Pandas, Seaborn, and Matplotlib to preprocess and analyze large-scale MMORPG player data. Focused on identifying relationships between combat power progression and dungeon/quest engagement.

Predicting Monthly Login Days Using Regression Models

Used PyCaret and a Gradient Boosting Regressor to predict next month’s login days from current attendance, spending, and combat power. Aimed to understand user engagement patterns by applying advanced regression techniques.

User Classification for MMORPG

Used Python with Pandas, PyCaret, Seaborn, and Matplotlib to conduct classification analysis on player combat power progression. The project aimed to predict future combat power levels and understand engagement with in-game activities such as dungeon participation.

User Segmentation through Clustering

Used Python, Pandas, Matplotlib, and MiniBatch KMeans to perform clustering analysis on user data, identifying distinct user segments based on metrics such as user level, quarterly attendance, quarterly payment, and combat power. The analysis supported retention event planning by focusing on user engagement levels.

Analyzing Top-Tier League of Legends Players

Data Cleaning and Behavioral Analysis of Challenger and Professional Gamers: Leveraged the League of Legends API to clean and analyze gameplay data, focusing on the behaviors of top Challenger players and professional e-sports gamers. Conducted extensive data cleaning and visualized the findings using Python.

Probability-Based Item Simulation and Strategy Analysis

Developed a Python-based simulation to analyze a probability-based item in an MMORPG’s in-game store. The project aimed to evaluate different player strategies and assess their efficiency in earning points and rewards based on randomized draws.
