Data Science & Machine Learning
Kaggle competitor. Graph neural network researcher. Production ML engineer. I don't just build models—I build winning systems.
Kaggle Competitor
I compete where the best data scientists in the world prove their worth. Kaggle isn't academic—it's survival of the fittest algorithms. My profile: kaggle.com/brianedwards
Why Kaggle Matters:
Anyone can claim ML expertise. Kaggle rankings are proof. You're competing against thousands of PhD researchers, industry veterans, and AI labs. Every competition teaches techniques that work in the real world—not just in papers.
March Machine Learning Mania
The annual NCAA tournament prediction competition. 68 teams, millions of possible brackets, and the chaos of March Madness. My system combines historical performance data, advanced basketball metrics, and ensemble learning to estimate the upset probability of every possible matchup.
Statistical Foundation
Ken Pomeroy metrics, strength of schedule adjustments, tempo-free efficiency ratings
Ensemble Models
XGBoost, LightGBM, neural networks—stacked for optimal bracket predictions
Upset Detection
Specialized models for identifying bracket-busting upsets before they happen
Reproducible Pipeline
Full end-to-end automation from data ingestion to submission generation
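The stacking idea behind the ensemble stage can be sketched with scikit-learn. This is a minimal illustration on synthetic data, using scikit-learn's own gradient-boosting and random-forest estimators as stand-ins where the real pipeline would plug in XGBoost, LightGBM, and neural networks:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for a game-level feature matrix
# (efficiency margins, tempo-free ratings, etc.).
X, y = make_classification(n_samples=400, n_features=12, random_state=0)

stack = StackingClassifier(
    estimators=[
        ("gbm", GradientBoostingClassifier(random_state=0)),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ],
    final_estimator=LogisticRegression(),  # meta-learner blends base outputs
    cv=5,  # base predictions are out-of-fold, so the meta-learner sees no leakage
)
score = cross_val_score(stack, X, y, cv=3).mean()
print(f"stacked CV accuracy: {score:.3f}")
```

The key design point is the internal `cv=5`: the meta-learner trains only on out-of-fold base predictions, which is what keeps stacked ensembles honest.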
Conduit - Kaggle Competition Pipeline
A battle-tested pipeline for Kaggle competitions. From raw data to leaderboard submission, Conduit handles the tedious infrastructure so you can focus on feature engineering and model experimentation. Built from lessons learned across dozens of competitions.
# Typical Conduit workflow
conduit init my-competition
conduit feature add momentum_indicators
conduit train --model xgboost --cv 5
conduit submit --ensemble best_3
Key Features:
- Automated cross-validation with stratification
- Feature importance tracking and selection
- Experiment logging with MLflow integration
- Ensemble creation and blending utilities
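What "stratified cross-validation" buys you can be shown in a few lines (a generic scikit-learn sketch, not Conduit's internal code): each validation fold preserves the class balance of the full dataset, which matters when the positive class (say, upsets) is rare.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Imbalanced labels: 25% positives, e.g. upset vs. non-upset games.
y = np.array([1] * 25 + [0] * 75)
X = np.arange(len(y)).reshape(-1, 1)

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
rates = []
for fold, (train_idx, val_idx) in enumerate(skf.split(X, y)):
    # Every validation fold keeps the 25% positive rate of the full set.
    pos_rate = y[val_idx].mean()
    rates.append(pos_rate)
    print(f"fold {fold}: n={len(val_idx)}, positive rate={pos_rate:.2f}")
```

A plain `KFold` on the same data could hand one fold far more positives than another, making fold scores incomparable; stratification removes that variance.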
World Bank Development Indicators
Global development data analysis using Polars—the blazing-fast DataFrame library that makes pandas feel like a horse-drawn carriage. This project processes decades of economic indicators across 200+ countries with the speed and efficiency that modern data science demands.
Development Indicators
Countries & Regions
Years of Data
Why Polars?
10-100x faster than pandas. Lazy evaluation. Multi-threaded by default. When you're processing billions of data points, speed isn't a luxury—it's survival.
Graphyard - Graph Analysis Tools & Blog
The world is graphs. Social networks, knowledge bases, molecular structures, supply chains—everything interesting is connected. Graphyard is my laboratory for graph algorithms, network analysis, and the deep mathematical structures that underpin complex systems.
Graph Algorithms
PageRank, community detection, centrality measures, shortest paths at scale
Network Visualization
Interactive force-directed layouts, hierarchical structures, temporal evolution
Technical Blog
Deep dives into graph theory, algorithm implementations, real-world applications
Open Source Tools
Reusable libraries for graph processing and analysis
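To make one of those algorithms concrete, here is a dependency-free power-iteration PageRank on an adjacency list. This is a textbook sketch for illustration, not code from Graphyard itself:

```python
def pagerank(adj, damping=0.85, iters=100):
    """Power-iteration PageRank on an adjacency list {node: [out-neighbors]}."""
    n = len(adj)
    rank = {node: 1.0 / n for node in adj}
    for _ in range(iters):
        new = {node: (1.0 - damping) / n for node in adj}
        for node, out in adj.items():
            if not out:  # dangling node: spread its rank uniformly
                for m in adj:
                    new[m] += damping * rank[node] / n
            else:
                share = damping * rank[node] / len(out)
                for m in out:
                    new[m] += share
        rank = new
    return rank

# Tiny toy graph: C is pointed at by both A and B, so it ranks highest.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(graph)
print({k: round(v, 3) for k, v in ranks.items()})
```

At scale the same iteration becomes a sparse matrix-vector product, which is where libraries and distributed runtimes take over.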
Graph Neural Networks (GNN)
When traditional neural networks meet graph structures, magic happens. GNNs learn representations that capture both node features and structural relationships. My implementations push the boundaries of message passing, attention mechanisms, and scalable training on massive graphs.
Applications:
- Drug discovery and molecular property prediction
- Social network analysis and influence propagation
- Recommendation systems with relational data
- Fraud detection in financial transaction networks
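The core message-passing step is simpler than it sounds. A minimal NumPy sketch of one GCN-style layer (Kipf and Welling's update, shown here with random weights purely for illustration): each node averages its neighbors' features through a normalized adjacency matrix, then applies a linear map and a nonlinearity.

```python
import numpy as np

rng = np.random.default_rng(0)

# 4-node toy graph: undirected edges as a symmetric adjacency matrix.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

A_hat = A + np.eye(4)                      # add self-loops
D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt   # symmetric normalization

H = rng.normal(size=(4, 8))                # node features: 4 nodes, 8 dims
W = rng.normal(size=(8, 3))                # layer weights (learned in practice)

# One message-passing step: aggregate neighborhoods, transform, ReLU.
H_next = np.maximum(A_norm @ H @ W, 0.0)
print(H_next.shape)
```

Stacking k such layers lets information propagate k hops, which is how GNNs blend node features with graph structure.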
GYAT - Graph Predictive Attention Network
My custom architecture combining graph attention mechanisms with predictive self-supervised learning. GYAT learns rich node embeddings by predicting masked graph structures—like BERT, but for networks. The attention mechanism dynamically weights neighbor contributions based on learned relevance.
Multi-Head Attention
Parallel attention heads capture diverse relationship types
Predictive Pretraining
Self-supervised learning on graph structure for robust representations
Scalable Training
Mini-batch sampling for graphs with millions of nodes
Research Frontier:
GYAT represents the cutting edge of graph representation learning. This isn't off-the-shelf ML—it's original research applied to real problems.
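The attention mechanism described above follows the general GAT pattern. The sketch below is a single-head, unbatched toy version of that pattern in NumPy, for intuition only; it is not the GYAT implementation, and the `neighbors`, `W`, and `a` values are illustrative placeholders.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(1)
H = rng.normal(size=(4, 6))    # node features
W = rng.normal(size=(6, 4))    # shared linear transform
a = rng.normal(size=(8,))      # attention vector over [Wh_i || Wh_j]

Z = H @ W
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}

# Score each neighbor with LeakyReLU(a . [z_i || z_j]), softmax the scores,
# and aggregate neighbor features with the resulting attention weights.
out = np.zeros_like(Z)
for i, nbrs in neighbors.items():
    scores = np.array([
        np.maximum(0.2 * s, s)  # LeakyReLU with slope 0.2
        for s in (np.concatenate([Z[i], Z[j]]) @ a for j in nbrs)
    ])
    alpha = softmax(scores)            # weights over neighbors sum to 1
    out[i] = (alpha[:, None] * Z[nbrs]).sum(axis=0)
print(out.shape)
```

Multi-head attention runs several independent copies of this scoring in parallel and concatenates or averages the results, letting different heads specialize in different relationship types.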
Why My Data Science is Different
Battle-Tested
Kaggle competitions are the proving ground. I've competed against the best and learned what actually works—not what looks good in a paper.
Production-Ready
Notebooks are prototypes. I build pipelines that run in production, scale with your data, and don't break at 3 AM.
Full Stack ML
Data engineering, feature stores, model training, deployment, monitoring—I own the entire lifecycle, not just the modeling phase.
Research-Grade
Custom architectures like GYAT aren't just academic exercises. They're competitive advantages when off-the-shelf solutions plateau.
Need ML That Actually Works?
Whether you need a Kaggle-winning prediction system, a production ML pipeline, or custom research—I deliver results, not just models.