Alec Wong

Professional Experience

Progressive Casualty Insurance – Lead Data Analyst

2019 - Present | 2022 - Present (at Lead level)

Highlighted role achievements

  • Replaced tabular estimation of core team metrics with statistical models, reducing run-time and providing a framework to interpret variable effects and model parsimony.
  • Modeled claims adjuster productivity with a zero-inflated negative binomial GLM (R package pscl) and non-linear mixed-effects models (R package brms), developing theory for an alternative staffing criterion.
  • Forecasted when claim ID numbers were at risk of exceeding their limits using Bayesian generalized linear regression with R package rstanarm, providing a recommended start date for a large-scale effort to increase claim ID size.
  • Increased efficiency by approximately 80 FTE by building a Python data pipeline that quantifies auto inspector drive times from location information and the Google Maps Distance Matrix API.
  • Improved team efficiency by building an interactive web application for comparing adjustments to our staff model, replacing a previously labor-intensive manual process.
  • Recast the team's central SQL dataset using delta compression, reducing storage size by over 5x and cutting analyst time spent writing lengthy, repetitive queries.
  • Experienced in several SQL dialects (T-SQL, DB2, Snowflake, SQLite, DuckDB), writing complex queries that join several tables, each with hundreds of millions of records.

Contributions additional to role

  • Top 3 finalist in the 2024 Inviztational, a company-wide visualization challenge; scheduled to present on January 7, 2025, a dynamic visualization displaying satellite imagery of hurricanes over their lifespan.
  • Developed and taught a curriculum for a dual R and Python course to 25 attendees over a 9-month period.
  • Enforced coding best practices, introduced collaborative Git workflows to our team, and checked critical code into GitHub repositories.
  • Developed several R and Python packages, including tools to eliminate boilerplate in SQL connections, extend ggplot2 with a company color palette, and provide other miscellaneous functions.
  • Spent over 100 hours between 2021 and 2024 providing one-on-one support to analysts across the company on R, Python, and SQL questions.
  • Led bi-weekly R Clinic support sessions as the R SME for Progressive (ongoing).
  • Assisted in testing Posit corporate productivity software, as well as an in-house Python cloud platform.
  • Won 2nd place in SPrize 2021, a company-wide predictive modeling challenge, involving over 15 teams.

Cornell University – Graduate Research Assistant

2015 - 2018

  • Gained competency with Bayesian inference using Markov Chain Monte Carlo (MCMC) simulation, geostatistical modeling, maximum likelihood optimization procedures, and generalized linear models.
  • Developed two novel statistical models that use MCMC to estimate animal population size and relationships with spatial habitat covariates, applied to moose in New York.
  • Tested the statistical models' performance via simulation analysis on a cluster of computers, using high-performance computing techniques under an original dual-layer parallelization scheme.
  • Led field research teams of up to 10 personnel in wilderness conditions, and led laboratory discussions on using Git and R Markdown to improve the organization of research.
  • Communicated results to statistical and ecological audiences nationally and internationally.

Education

Software

  • R (since 2014. Uses: statistical analysis, package development, web app development)
  • Python (since 2019. Uses: ETL, web APIs, automation, deep learning)
  • Git
  • SQL
  • Front-end development (HTML, CSS, JavaScript)
  • Tableau / Power BI
  • ArcGIS

Code Examples

PyTorch implementations for image and video

  • An exploration of neural network architectures for image and video.
  • Tasks explored: image generation/completion, regression from video input.
  • Key lessons: data loaders, model training, model evaluation, CUDA integration, U-Net, network visualization.

Advent of Code

  • Ranked second out of 14 participating data scientists and BI developers on the company leaderboard.
  • Primarily interested in solving puzzles with awk to enhance programming skills with Linux tools.
  • Selected entries (2021):

Automated internet speed tests, hosted on my website

  • Source code.
  • Tests speeds daily and updates website.
  • Displays data with D3.

This resume!

  • Follows a make workflow to generate the output files. Just run make to compile everything!
  • Uses rmd, scss, pandoc, and wkhtmltopdf to create the document.