Jun Clemente

Applied Data Scientist - M.S. Applied Data Science, USD

Turning complex data
into clear decisions

I build end-to-end data systems, from pipelines to deployed models, that turn complex, high-stakes data into decisions that hold up under scrutiny.

Selected Projects

Early Warning System for Student Outcomes

End-to-end ML classification framework covering 958 California public high schools, evaluating 7 models under class imbalance using PR-AUC. Random Forest (PR-AUC 0.775) deployed in a live Streamlit app. Manuscript accepted for publication in the 2025 USD Capstone Chronicles.

Washington Traffic Data Pipeline

End-to-end real-time data pipeline ingesting traffic, weather, and incident data from WSDOT REST APIs into a cloud-hosted MySQL database on Azure, with automated ETL scheduling and a live Tableau dashboard.

Workplace Health Policy Optimization

Cloud-based predictive analytics pipeline evaluating the ROI of workplace health policies in terms of productivity and absenteeism, using public CDC, BLS, and County Health Rankings datasets.

School Sentiment NLP

NLP-based sentiment and topic analysis comparing Reddit discussions of high- and low-performing school districts (Palo Alto vs. Oklahoma City) to surface community perception patterns.