personal-chatbot/data/knowledge_base.json

	{
	"0": {
	"title": "Lead Software Engineer at SchedGo (now EduRoute)",
	"dates": "June 2022 - September 2023",
	"topics": "startup, software engineering, leadership, full-stack, front-end, back-end, REST API, data schemas, CRUD services, React, TypeScript, React Joyride, Google Firestore (NoSQL), Git, degree planning, Big Bang Business Competition",
	"details": [
	"Led a team of five software engineers and two UI/UX designers to build a full-stack college degree planning web application that helps students map four-year academic plans.",
	"The application became the startup’s core product and contributed to winning the $30,000 grand prize at UC Davis' Big Bang Business Competition 2023.",
	"Defined product vision, delegated tasks, oversaw UI/UX, and maintained timelines; communicated progress and goals to the business team.",
	"Independently designed and implemented three REST API endpoints, 10+ data schemas, and CRUD services to improve scalability and efficiency.",
	"Implemented front-end features including a guided tutorial (React Joyride) and a text parsing workflow for extracting user information from raw text and PDFs.",
	"Stack: TypeScript, React, Google Firestore (NoSQL), Git (collaboration and code management)."
	]
	},
	"1": {
	"title": "Machine Learning Researcher at USC AutoDrive Lab",
	"dates": "January 2024 - May 2025",
	"topics": "machine learning research, autonomous driving, motion planning, benchmarking, reproducibility, hyperparameter tuning, PyTorch, NuPlan, CARLA, Waymo",
	"details": [
	"Analyzed top autonomous driving competition submissions (NuPlan, CARLA, Waymo), synthesized solution approaches, and presented actionable findings to the lab.",
	"Reproduced and validated promising methods within the lab’s simulation environment; benchmarked against prior baselines under standardized scenarios.",
	"Performed hyperparameter sweeps and implementation extensions to improve planner robustness and generalization.",
	"Examined and reproduced approaches including GameFormer Planner and learning-based motion planning variants (e.g., insights from “Parting with Misconceptions about Learning-Based Vehicle Motion Planning”); documented learnings for lab adoption."
	]
	},
	"2": {
	"title": "Software Integration Engineer Intern at NASA Deep Space Network, Peraton",
	"dates": "June 2024 - August 2024",
	"topics": "internship, full-stack, web application, SQLite, sql.js, React, TypeScript, developer experience, user research, Vite, pnpm, ESLint, Prettier, Git hooks",
	"details": [
	"Redesigned and rebuilt an internal dashboard to retrieve application data and configurations (database info, assigned employees, settings) efficiently.",
	"Measured outcomes via pre/post user interviews and surveys: average satisfaction increased from 3 to 8 (>150% improvement).",
	"Built the front end with React + TypeScript; integrated sql.js to run a fully in-browser SQLite database with query execution and offline persistence via downloadable files.",
	"Communicated prototype limitations clearly (no backend auth/access control; not production-secure) to set appropriate expectations.",
	"Collaborated with management and engineering for requirements; supervised another intern with task delegation and code reviews.",
	"Established a streamlined developer workflow using Vite and pnpm; enforced quality with ESLint, Prettier, and Git hooks."
	]
	},
	"3": {
	"title": "Multi-Label Emotion Classification in Text using Transformer Models for Feature Extraction",
	"dates": "January 2025 - February 2025",
	"topics": "multi-label classification, NLP, emotion detection, embeddings, transformers, BERT, DistilBERT, RoBERTa, TF-IDF, binary relevance, class imbalance, F1 score, PyTorch, Hugging Face",
	"details": [
	"Developed a pipeline for multi-label emotion classification; explored TF-IDF baselines and transformer-based feature embeddings.",
	"Chose F1 as a core metric to address pronounced class imbalance (per-emotion positive rates and positive/negative ratios).",
	"Performed basic preprocessing (lowercasing, contraction expansion, whitespace cleanup); staged stopword removal and lemmatization for later ablations.",
	"Used an 80/20 stratified split to maintain label distributions across train/test.",
	"Established a TF-IDF baseline with binary relevance using logistic regression and SVM (F1 ≈ 13%).",
	"Extracted DistilBERT and RoBERTa embeddings; trained simple classifiers with multiple loss functions (log, hinge, modified Huber, perceptron).",
	"Best model: DistilBERT embeddings + perceptron loss achieved ~36% F1 (≈23-point absolute improvement over baseline).",
	"Planned next steps: compare Word2Vec/GloVe vs. transformer embeddings; evaluate MLP/LSTM/GRU; consider fine-tuning a transformer on-task.",
	"GitHub: https://github.com/mkschulz9/multi-label-text-classification"
	]
	},
	"4": {
	"title": "Associate of Science in Computer Science at Diablo Valley College (DVC)",
	"dates": "September 2018 - May 2021",
	"topics": "A.S., education, computer science, Diablo Valley College, GPA, calculus, programming, foundational CS",
	"details": [
	"Associate of Science in Computer Science, Diablo Valley College (Bay Area), May 2021; GPA 3.57.",
	"Completed: Calculus I–III, Linear Algebra, Differential Equations, Discrete Mathematics, Object-Oriented Programming in C++, Program Design & Data Structures.",
	"Began as a business major; pivoted to CS after discovering a better fit and strong interest in building practical solutions."
	]
	},
	"5": {
	"title": "Bachelor of Science in Computer Science at University of California, Davis (UC Davis)",
	"dates": "September 2021 - June 2023",
	"topics": "B.S., education, computer science, UC Davis, GPA 3.8, AI/ML coursework, systems, HCI",
	"details": [
	"Bachelor of Science in Computer Science, UC Davis, June 2023; GPA 3.8.",
	"Courses: Computer Architecture, Theory of Computation, Operating Systems, Computer Networks, Probability, Statistics, Web Programming, Artificial Intelligence, Machine Learning, Deep Learning, Programming Languages, Human-Computer Interaction.",
	"Discovered a deep interest in AI/ML and also enjoyed web programming for its rapid idea-to-UI feedback loop."
	]
	},
	"6": {
	"title": "Master of Science in Computer Science (Focus: AI/ML) at University of Southern California (USC)",
	"dates": "August 2023 - May 2025",
	"topics": "M.S., graduate education, USC, AI, ML, computer science, education",
	"details": [
	"Graduated May 2025 with an M.S. in Computer Science (AI/ML focus).",
	"Courses: Deep Learning, Machine Learning, Database Systems, Applied NLP, Advanced Computer Vision, Large-Scale Optimization for ML, Design and Analysis of Algorithms, Foundations of AI.",
	"Expanded technical depth in ML while growing a network of peers and mentors focused on AI-driven products."
	]
	},
	"7": {
	"title": "Software Engineer, GenAI/Agentic/GraphML Applications — Visa (EOR/APFD)",
	"dates": "2025 - Present",
	"topics": "Visa, payments risk, fraud detection, GenAI, agentic AI, LangGraph, LangChain, Model Context Protocol, MLOps, reliability, observability, graphML, graph ML",
	"details": [
	"Builds AI applications to detect, prevent, and mitigate fraud across Visa’s prepaid-card ecosystem.",
	"Designs multi-agent tool-use patterns with strong guardrails (traceability, auditability) using LangGraph for compliance-oriented environments.",
	"Builds graph ML powered applications for detecting fraud rings in Visa's prepaid card network.",
	"Explores MCP servers and orchestration best practices to improve modularity, safety, and maintainability."
	]
	},
	"8": {
	"title": "NBA Player Stats Predictor — Next-Game Multi-Target Regression",
	"dates": "2024 - 2025",
	"topics": "sports analytics, ensemble learning, LightGBM, XGBoost, graph neural networks, quantile regression, Monte Carlo simulation, NBA API, Basketball Reference, time-aware validation, uncertainty quantification, production pipeline",
	"details": [
	"Ensemble combining LightGBM, XGBoost, and optional GNN residuals with stacking; probabilistic outputs via quantile regression (Q10/Q50/Q90).",
	"Two-stage compositional modeling: minutes prediction (LightGBM quantile) plus multi-target per-minute rates; final per-game stats computed as minutes × per-minute rates.",
	"Lineup projection via Monte Carlo (100–1000 samples) to model injury uncertainty, rotation patterns, and team pace/efficiency; generates lineup-conditional features.",
	"Real-time integration of NBA API and Basketball Reference, enriched with injury reports, venue altitude, travel/rest penalties, and scheduling context.",
	"Time-aware validation preventing leakage and slice-based evaluation; typical results: minutes MAE ≈ 3.2 with ~79–80% 80% PI coverage; multi-stat 80% PI coverage ≈ 76–84%.",
	"Configurable production pipeline and CLI (ingest, run-day, train, stack, eval, predict, lineup) with YAML configs, reproducible seeds, and JSON/CSV/Parquet outputs.",
	"Graph-based features over player–team–game relations via GNNs; optional residual correction and meta-learning with out-of-fold predictions."
	]
	},
	"9": {
	"title": "PCB Trace Length Extractor — Graph Pathfinding on Board JSON",
	"dates": "2024 - 2025",
	"topics": "graph algorithms, Dijkstra, geometry, Shapely, dataclasses, CAD parsing, testing",
	"details": [
	"Parses JSON-encoded PCB objects and applies shortest-path algorithms to compute trace lengths accurately.",
	"Improves geometric robustness (segment stitching, tolerances) and adds regression tests to prevent numerical regressions."
	]
	},
	"10": {
	"title": "Growth Focus: ML Ethics & Productionizing ML at Scale",
	"dates": "Ongoing",
	"topics": "ML ethics, responsible AI, governance, security, MLOps, monitoring, data quality, rollback",
	"details": [
	"Identified a need to deepen expertise in responsible AI and end-to-end production ML.",
	"Actions: builds evaluation harnesses for agentic systems (toxicity/safety checks, tool-use audits), adopts model/data versioning, and adds observability (latency, failure modes, guardrail triggers).",
	"Pursues targeted certifications/courses (Responsible AI, MLOps) and applies patterns directly to Visa prototypes."
	]
	},
	"11": {
	"title": "Hobbies & Interests",
	"dates": "Ongoing",
	"topics": "outdoors, food, fitness, friends and family, gym, personal projects, real estate, entrepreneurship, NBA, local events",
	"details": [
	"Enjoys being outdoors (day trips, hikes, time around lakes and parks) and exploring local food spots with friends and family.",
	"Regular gym routine focused on strength and conditioning; values consistency and measurable progress.",
	"Builds personal projects (agentic AI, analytics, tooling) to learn by shipping and to test ideas quickly.",
	"Active interest in real estate (deal analysis, market comps, lead scoring) and entrepreneurship (turning projects into products).",
	"Follows the NBA and attends local events (concerts, sports) when possible."
	]
	},
	"16": {
	"title": "Strengths & Weaknesses (with Active Remediation)",
	"dates": "Ongoing",
	"topics": "technical leadership, end-to-end ownership, agentic AI, MLOps, reliability, type-safety, communication, entrepreneurship, responsible AI, scope management, UI/UX polish, academic writing",
	"details": [
	"Strength — Technical leadership: sets clear roadmaps, decomposes ambiguity into milestones, and drives cross-functional delivery.",
	"Strength — End-to-end ML systems: builds from data ingestion and modeling to orchestration, evaluation, and deployment with strong observability.",
	"Strength — Agentic AI proficiency: designs tool-use/guardrail patterns (traceability, auditability), ensembles (LightGBM/XGBoost/GNN), and uncertainty-aware predictions.",
	"Strength — Code quality & reproducibility: mypy, Ruff, pre-commit, dataclasses, config-driven pipelines, seeded runs, and CI-friendly CLIs.",
	"Strength — Communication & product sense: concise docs, slice-based eval reports, and stakeholder summaries that inform decisions; bias to action and iteration.",
	"Weakness — Responsible AI depth: expanding bias/harms analysis and policy alignment; Remediation: automated safety checks, red-team evaluations, and standardized model cards.",
	"Weakness — Productionization at scale: reducing fragility under load; Remediation: SLAs/SLOs, canary deploys, rollback runbooks, versioned data/features, drift detection with scheduled retrains.",
	"Weakness — Scope management: tendency to over-scope early; Remediation: milestone slicing, crisp acceptance criteria, and weekly burn-downs tied to measurable outcomes.",
	"Weakness — UI/UX polish: functional UIs can be sparse; Remediation: adopt component libraries, heuristic reviews, and quick user tests before ship.",
	"Weakness — Long-form/academic writing speed: Remediation: structured outlines, citation managers, time-boxed drafting, and iterative reviews."
	]
	}
	}