Luke Hobbie

Villanova University | Analytics & Statistics | Turning Data into Insight

3.92 GPA | Dean's List All Semesters

Resume

Education

Villanova University
MS in Applied Statistics and Data Science
Expected Graduation: December 2027
BS in Comprehensive Science, Minor: Statistics
Expected Graduation: December 2026 | Dean's List (All Semesters) | GPA: 3.92/4.00

Selected Courses: Applied Statistical Models, Multivariable Calculus, Advanced Physics/Chemistry/Biology

Professional Experience

Trem (HealthTech R&D), Villanova University
June 2025 - July 2025
Data and Insights Intern | London, UK
  • Analyzed tremor-tracking technologies and data methodologies to inform product development strategy for next-generation health monitoring devices
  • Researched clinical tremor scales and signal processing methods to determine optimal quantitative evaluation metrics for patient assessment
  • Improved backend data infrastructure using Supabase and SQL, enhancing database performance and data integrity across multi-tenant architecture
  • Collaborated with design team on layout and visuals to enhance clarity and usability of analytics dashboards for clinical practitioners
  • Pitched innovative product ideas for the RCA Innovation Launchpad presentation to potential investors and stakeholders
Alpha Phi Omega, Villanova University
January 2025 - Present
Social Committee Member | Lancaster, PA
  • Planned and coordinated events to improve member engagement and foster community building within the organization
  • Helped design and launch a service project to benefit wildlife preservation, coordinating volunteer efforts and resources
Harvey Cedars Bible Conference
June 2024 - August 2024
Manager of Activity Center | Harvey Cedars, NJ
  • Supervised and organized weekly recreational activities for over 100 participants, ensuring engaging and age-appropriate programming
  • Ensured safety protocols and inclusive participation for all attendees through careful planning and risk management

Extracurricular & Community Engagement

Sports Analytics Club, Villanova University
January 2024 - Present
  • Participate in weekly discussions on advanced sports metrics including WAR (Wins Above Replacement), EPA (Expected Points Added), APM (Adjusted Plus-Minus), and xG (Expected Goals)
  • Evaluate player and team performance through comprehensive data visualizations in R, utilizing ggplot2 and other analytics packages
Leadership STEM, Villanova University
August 2024 - Present
  • Selected for a competitive STEM leadership cohort focused on professional development, networking, and peer mentoring within the sciences
Special Olympics, Villanova University
November 2024 - Present
  • Served food to over 500 athletes and 1,000 volunteers during the world's largest student-run Special Olympics event, demonstrating commitment to community service
DREAMS (Discovering Resources Exploring Advanced Mathematics and Statistics), Villanova University
May 2024
  • Chosen as one of two first-year students to explore graduate-level research and career opportunities in statistics and data science
  • Engaged with faculty members to discuss mathematics research careers, methodologies, and pathways to graduate education

Technical Skills & Knowledge

Programming & Data Analysis

R: Statistical modeling, data visualization (ggplot2), data manipulation (dplyr, tidyr), ANOVA, confidence intervals, hypothesis testing, regression analysis

SAS: PROC procedures, data merging, statistical analysis, data management

SQL (PostgreSQL): Database design, Supabase implementation, RLS policies, multi-tenant database management, query optimization

Java: Object-oriented programming, data structures for program efficiency and performance

Tools & Software

Microsoft Excel & Office: Advanced statistical functions, pivot tables, data charts, visualizations, and complex formula implementation

Data Visualization: Creating compelling visual narratives from complex datasets using R and Excel

Certifications

Bloomberg Market Concepts (BMC): Completed comprehensive modules in Economic Indicators, Currencies, Fixed Income, and Equities, demonstrating understanding of financial markets and data analysis

Interests & Passions

Sports Analytics: Passionate about applying statistical methods to evaluate athletic performance

Community Service: Active volunteer committed to making positive social impact

Research: Enthusiastic about exploring new statistical methodologies and data-driven problem solving

Data Analytics Projects

Global Happiness Index: Analyzing Economic and Social Predictors of National Well-being

Project Overview: Conducted a comprehensive analysis of the World Happiness dataset to identify key economic and social factors that predict national happiness levels across 137+ countries in 2023.

Tools & Technologies: R (tidyverse, ggplot2, dplyr), Statistical Analysis, Data Visualization

Methodology:

  • Performed exploratory data analysis on multiple happiness indicators including GDP per capita, life expectancy, social support, freedom of choice, and corruption levels
  • Created visualizations to identify distribution patterns and outliers across demographic variables
  • Conducted correlation analysis between happiness scores and economic/social factors using scatter plots and regression trend lines
  • Segmented countries by social support levels to analyze differential impacts of economic factors
  • Performed comparative analysis of happiness trends across multiple countries over time
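
The correlation and segmentation steps above can be sketched roughly as follows. This is an illustrative Python sketch, not the project's R code: the six rows are made-up placeholder values rather than World Happiness data, and the helper names (`pearson`, `ols_slope`, `segment_by_support`) are hypothetical. It assumes countries are split at the median social support value, which is one plausible way to form the segments.

```python
from math import sqrt
from statistics import mean, median

# Hypothetical placeholder rows, NOT real World Happiness data:
# (country, happiness score, log GDP per capita, social support)
rows = [
    ("A", 7.5, 10.9, 0.95),
    ("B", 6.8, 10.5, 0.90),
    ("C", 6.0, 9.8, 0.85),
    ("D", 5.2, 9.1, 0.70),
    ("E", 4.4, 8.4, 0.60),
    ("F", 3.6, 7.9, 0.50),
]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def ols_slope(xs, ys):
    """Slope of the least-squares trend line of ys on xs."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )

def segment_by_support(data):
    """Split rows into high/low social support at the median (assumed cutoff)."""
    cutoff = median(r[3] for r in data)
    high = [r for r in data if r[3] >= cutoff]
    low = [r for r in data if r[3] < cutoff]
    return high, low

# Overall correlation between log GDP and happiness
r_all = pearson([r[2] for r in rows], [r[1] for r in rows])

# GDP-happiness trend line slope within each social support segment
high, low = segment_by_support(rows)
slope_high = ols_slope([r[2] for r in high], [r[1] for r in high])
slope_low = ols_slope([r[2] for r in low], [r[1] for r in low])

print(f"overall r = {r_all:.3f}")
print(f"slope (high support) = {slope_high:.3f}, slope (low support) = {slope_low:.3f}")
```

Comparing the two slopes is a simple way to see whether the GDP-happiness relationship differs between segments; a fuller analysis would likely fit a model with an interaction term instead.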

Key Findings:

  • Strong Positive Correlations: Log GDP per capita, life expectancy, and social support all showed strong positive relationships with national happiness levels
  • Freedom Impact: Countries with high freedom of choice consistently reported higher happiness scores compared to low-freedom countries
  • Corruption Effect: Identified an inverse relationship between perceived corruption and happiness - countries with lower corruption rates reported significantly higher well-being
  • Extreme Cases: Afghanistan ranked lowest (1.45 happiness score) with 55.2 life expectancy and 73.8% corruption perception, while Finland topped the rankings (7.70 happiness) with 71.3 life expectancy and only 18.5% corruption perception
  • Economic Context Matters: When segmented by social support levels, the relationship between GDP and happiness varied, suggesting that strong social networks can buffer economic disparities

Business Applications: This analysis demonstrates skills applicable to market research, policy analysis, economic forecasting, and social impact assessment - crucial for organizations making international investment decisions or developing social programs.

Technical Highlights:

  • Data filtering and transformation using dplyr functions (group_by, summarize, mutate)
  • Advanced ggplot2 visualizations including histograms, scatter plots with trend lines, boxplots, and multi-series line charts
  • Statistical methods including median calculations, categorical variable creation, and comparative analysis
  • Created reproducible analysis workflow with clear documentation

Pennsylvania Education and Income Analysis: Exploring Socioeconomic Patterns by Zip Code

Project Overview: Conducted a comprehensive geospatial analysis combining Pennsylvania college graduation rates with median household income data across 1,690+ zip codes to identify socioeconomic patterns and educational attainment relationships.

Tools & Technologies: SAS (PROC IMPORT, PROC FORMAT, PROC MEANS, PROC TABULATE, PROC UNIVARIATE, PROC SGPLOT), Data Management, Statistical Analysis

Methodology:

  • Imported and merged two large datasets: Pennsylvania graduation rates by zip code and national median income data
  • Performed extensive data cleaning including standardizing zip code formats, handling missing values, and validating geographic consistency
  • Created three analytical datasets through strategic data partitioning: matched records (1,690 zip codes), records with missing income data (33 zip codes), and records without graduation data
  • Segmented college graduation rates into quartile-based groups (Low, Med-Low, Med-High, High) using PROC UNIVARIATE for distribution analysis
  • Calculated aggregate statistics by graduation group using PROC MEANS and PROC TABULATE
  • Visualized relationships between population density, income levels, and educational attainment using scatter plots
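
The merge, partition, and quartile-grouping steps above can be sketched roughly as follows. This is an illustrative Python sketch, not the project's SAS code: the zip codes, rates, and incomes are made-up placeholders, and `grad_group` is a hypothetical helper. It assumes quartile cut points are computed on the matched records only.

```python
from statistics import mean, quantiles

# Hypothetical placeholder inputs keyed by zip code, NOT the project's data
grad_rates = {
    "19001": 0.42, "19002": 0.55, "19003": 0.18, "19004": 0.30, "19005": 0.61,
    "19006": 0.25, "19007": 0.48, "19008": 0.12, "19009": 0.35,
}
incomes = {
    "19001": 68000, "19002": 91000, "19003": None, "19004": 52000,
    "19005": 120000, "19006": 47000, "19007": 75000, "19008": 39000,
    "19009": 58000, "90210": 150000,  # present in income data only
}

# Partition into three datasets, mirroring the merge described above
matched = {
    z: (rate, incomes[z])
    for z, rate in grad_rates.items()
    if incomes.get(z) is not None
}
missing_income = [z for z in grad_rates if incomes.get(z) is None]
no_grad_data = [z for z in incomes if z not in grad_rates]

# Quartile cut points (three values) from the matched graduation rates
cuts = quantiles([rate for rate, _ in matched.values()], n=4)
LABELS = ["Low", "Med-Low", "Med-High", "High"]

def grad_group(rate, cut_points):
    """Assign a quartile-based label to a graduation rate."""
    for label, cut in zip(LABELS, cut_points):
        if rate <= cut:
            return label
    return LABELS[-1]

# Aggregate mean income by graduation group
by_group = {label: [] for label in LABELS}
for rate, income in matched.values():
    by_group[grad_group(rate, cuts)].append(income)
mean_income = {label: mean(vals) for label, vals in by_group.items() if vals}

for label in LABELS:
    print(label, mean_income.get(label))
```

Keeping the unmatched records in their own lists, rather than dropping them, makes it easy to inspect which zip codes fell out of the merge and why.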

Key Findings:

  • Data Coverage: Successfully matched 1,690 Pennsylvania zip codes between education and income datasets, representing comprehensive statewide coverage
  • Missing Data Patterns: Identified 33 zip codes with missing income data, primarily in low-population areas where income data collection was challenging
  • Exponential Income Relationship: Discovered a strong exponential relationship between college graduation quartiles and median income - areas with higher graduation rates showed disproportionately higher income levels
  • High-Achiever Outlier: The "High" college graduation group exhibited significantly elevated income and population metrics compared to other quartiles, suggesting clustering of educated populations in specific geographic areas
  • Progressive Graduation Impact: Each successive quartile (Low → Med-Low → Med-High → High) showed measurable increases in both median income and population density, consistent with an association between educational attainment and regional prosperity

Business Applications: This analysis provides valuable insights for educational policy makers, real estate developers, and economic development organizations seeking to understand the relationship between educational infrastructure and regional prosperity. The methodology demonstrates skills in data integration, geographic analysis, and identifying actionable socioeconomic trends.

Technical Highlights:

  • Complex data merging using SAS BY-group processing with multiple datasets
  • Custom formatting and labeling for categorical variables using PROC FORMAT
  • Statistical distribution analysis using PROC UNIVARIATE to determine meaningful quartile breakpoints
  • Conditional logic implementation for data categorization and subsetting
  • Summary statistics generation with OUTPUT statements for downstream analysis
  • Professional data visualization using PROC SGPLOT scatter plots