WENXUAN TANG


About Me

Data Scientist @ Mercatus Center | DS Master's @ Georgetown University | Looking for a full-time Data Scientist/Analyst or Machine Learning Engineer position after May 2023

Location

Arlington, VA
Education
    Georgetown University
    August 2021 - May 2023
    degree
    Master's
    major
    Computer Science
    coursework
    Machine Learning, Deep Learning (NLP, Computer Vision), Big Data, Cloud Computing
    Sun Yat-sen University
    September 2016 - June 2021
Work Experience
    Joblogic-X, Data Scientist
    Plano, TX, US
    May 2021 - September 2022
    overview
    - Credit Risk Analysis for Short-Term Loan Applications
      - Responsibilities include writing complex SQL queries for data extraction within a real-time Big Data environment, building fully automated ETL data pipelines to ensure data quality and integrity, and applying advanced machine learning techniques
      - Built fully automated ETL data pipelines with Python (Airflow) and complex SQL, joining and aggregating data from multiple
      - Applied advanced data cleaning techniques to handle missing values, outliers, distorted data distributions, etc., and ingested data under heavy workloads to ensure data integrity and consistency, improving data accuracy by 47
      - Leveraged modern machine learning models (Logistic Regression, Support Vector Machine (SVM), Decision Tree with Pruning, Random Forest, Boosted Tree, Artificial Neural Network, etc.) to predict the probability of loan default and determine the optimal pricing terms/strategies balancing profitability and risk management. Models are being used in daily production and help the client company drive $11+ MM productivity annually
      - Developed centralized interactive dashboarding capability that summarized loan default behavior and risk management
    - Recommendation for Store Opening Site Locations for Meet Fresh
      - Developed robust ETL data pipelines extracting and merging data through multiple channels (Google Map Geocoding & Routing Service, Yelp API, SQL, CSV, etc.) 24/7 with zero human intervention; automation helps reduce manual labor by 97
      - Cleansed and preprocessed raw data, followed by feature engineering on 100+ categorical and numeric features
      - Built clustering models (Mini-Batch K-Means, DBSCAN), followed by model selection using performance metrics, and identified potentially profitable locations. The model was utilized by the client company in deciding on locations of 20+ new
    KPMG
    Data Consulting and Digital Transformation Summer Intern
    Guangzhou, GD, People's Republic of China
    June 2020 - October 2020
    Allianz
    Data Scientist Summer Intern
    Guangzhou, GD, People's Republic of China
    June 2019 - October 2019
Volunteer
    CADSEA
Hobbies
Running (D.C. Half Marathon), Tennis (12 YoE), Traveling (25+ countries), Gardening
Github / Code
WenxuanTang
3 followers, 7 stars, 11 repositories
Technical Skills
  • Python
  • SQL