Zijian (Michael) Wang - Data Science & Statistics Portfolio

Zijian Wang's Portfolio

Thesis: “Mini-Max-Structured Neural Tangent Kernel in Estimating Average Treatment Effect Confounded by Image Covariate”, 2025

Conducted as my Honors Thesis under the guidance of Dr. David Hirshberg, this project explores how image-based confounding affects causal inference and introduces a novel two-step minimax estimator based on the Neural Tangent Kernel (NTK). The method formulates a balancing objective in a reproducing kernel Hilbert space (RKHS) and estimates inverse probability weights that minimize worst-case imbalance. Applications include observational treatment effect estimation in medical imaging and other high-dimensional visual settings.
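
The balancing idea can be illustrated with a minimal sketch. This is not the thesis code: it swaps in a Gaussian (RBF) kernel as a stand-in for the NTK, treats covariates as plain vectors rather than images, and adds a ridge penalty `lam` that is an assumption of this example. Function names are hypothetical. Over the unit ball of an RKHS, the worst-case imbalance between the weighted treated sample and the full sample has a closed form, so the penalized weights come from a single linear solve.

```python
import numpy as np

def rbf_kernel(A, B, bandwidth=1.0):
    """Gaussian kernel matrix; a stand-in for the NTK in this sketch."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2 * bandwidth**2))

def balancing_weights(X, treated, lam=1e-2, bandwidth=1.0):
    """Weights gamma on treated units minimizing
    (squared worst-case imbalance) * n^2 + lam * ||gamma||^2."""
    Xt = X[treated]
    K_tt = rbf_kernel(Xt, Xt, bandwidth)   # treated vs. treated
    K_ta = rbf_kernel(Xt, X, bandwidth)    # treated vs. all units
    target = K_ta.sum(axis=1)              # K_ta @ 1
    # First-order condition: (K_tt + lam I) gamma = K_ta 1
    return np.linalg.solve(K_tt + lam * np.eye(len(Xt)), target)

def worst_case_imbalance(X, treated, gamma, bandwidth=1.0):
    """sup over the RKHS unit ball of
    |(1/n) sum_i f(x_i) - (1/n) sum_{treated} gamma_i f(x_i)|."""
    n = len(X)
    K = rbf_kernel(X, X, bandwidth)
    v = -np.ones(n)          # every unit enters the target mean with weight -1
    v[treated] += gamma      # treated units also carry their balancing weight
    return np.sqrt(max(v @ K @ v, 0.0)) / n
```

Given outcomes `Y` for the treated units, the weighted average `(gamma * Y).sum() / n` would then estimate the mean potential outcome under treatment. Repeating the construction for the control group gives the other arm of the average treatment effect.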

Thesis PDF Defense Slides GitHub Poster

Project: “Clothing Segmentation and Virtual Try-On Pipeline”, 2025

In collaboration with Jessie Ni and Michael Cao, this project focused on semantic segmentation of fashion images using U-Net and FCN models to accurately extract clothing masks from multi-class datasets. The extracted segments were integrated into a virtual try-on system using GAN-based pipelines (e.g., VITON and TryOnGAN). Our work addressed challenges in garment alignment, resolution artifacts, and mask precision, producing realistic visual try-ons from raw clothing and model images. This project demonstrates the practical fusion of computer vision and fashion-tech, with potential applications in e-commerce and retail personalization.

PDF GitHub Poster

Research: “Bias Bound of Synthetic Control Estimator”, 2024

Guided by Dr. David Hirshberg, we developed a new perspective on bounding post-treatment bias in synthetic control methods. Our approach connects pre-treatment fit directly with post-treatment bias. The full derivation and comparisons with past methods are included in the appendix.

PDF

Project: “Behind Augusta, Georgia”, 2024

In collaboration with Jiaqi Chen and Humaira Tanzeem, we assessed the conditions of disadvantaged communities in Augusta, Georgia and how their environments have changed over time. Using GIS and SARIMA models, we presented our findings through a public-facing interactive web platform. The project involved data scraping, time-series analysis, and environmental justice mapping.

Web Link

Project: “How Misogynistic is Rap? Lyrical and Comparative Analysis”, 2023

In collaboration with Indy Gu and Michael Cao, this project quantitatively assessed the presence of misogyny in music lyrics across genres using natural language processing (NLP), semantic similarity, and statistical analysis. We compared rap lyrics against vectors of misogynistic keywords to determine genre-level trends.
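
The comparison against keyword vectors can be sketched as follows. This is an illustration under assumptions, not the project's pipeline: it uses plain term counts instead of whatever embeddings the project used, scores with cosine similarity, and takes placeholder tokens (`kw_a`, `kw_b`) in place of the actual keyword list. All function names are hypothetical.

```python
import math
from collections import Counter

def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as having no similarity
    return dot / (norm_a * norm_b)

def keyword_score(lyrics, keywords):
    """Similarity between one song's term counts and the keyword vector."""
    tokens = lyrics.lower().split()
    return cosine_similarity(Counter(tokens), Counter(keywords))

def genre_means(songs, keywords):
    """Average keyword score per genre; songs is a list of (genre, lyrics)."""
    scores = {}
    for genre, lyrics in songs:
        scores.setdefault(genre, []).append(keyword_score(lyrics, keywords))
    return {genre: sum(s) / len(s) for genre, s in scores.items()}
```

Calling `genre_means` on a list of `(genre, lyrics)` pairs returns one average score per genre, which is the kind of genre-level summary that a statistical comparison could then be run on.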

PDF

Project: “The Atlanta Braves' Player Evaluations”, 2023

With Chris Paik and Sam Chen, we investigated the Atlanta Braves’ player contract strategy, focusing on their emphasis on young talent. Through statistical modeling, we analyzed the traits valued by the Braves, examining the profiles of players such as Ronald Acuña Jr. and Ozzie Albies.

PDF

Project: “Warriors Report: 2022–2023 Season Analysis”, 2023

This analysis evaluated the Golden State Warriors’ fluctuating performance during the 2022–23 season using advanced stats and regression techniques. We examined Klay Thompson’s impact, first-quarter trends, shooting efficiency, bench versus starters, and Jordan Poole’s contribution.

PDF

Project: “What’s Wrong With This Golden State Warriors?”, 2022

A deep dive into the Warriors’ mid-season slump. This study used play-by-play data to investigate offensive/defensive efficiency, key player absences, and performance volatility. We also analyzed Curry’s shooting patterns and the defensive impact of Draymond Green.

PDF