Jash Shah


Location

Jersey City, NJ
Education
    Stevens Institute of Technology
    September 2023 - May 2025
    degree
    Master's
    major
    Data Science
    coursework
    machine learning
    Gujarat Technological University
    June 2019 - May 2023
Work Experience
    Johnson & Johnson
    Data Scientist
    Jersey City, NJ, United States, 07399
    July 2024 - present
    company
    Johnson & Johnson
    title
    Data Scientist
    overview
    Part-time contract
    • Fine-tuned Large Language Models (LLMs) using AWS Bedrock to enhance risk prediction accuracy and optimize clinical decision support systems, reducing patient readmission rates by 18%.
    • Engineered a semantic search engine over unstructured clinical notes using Retrieval-Augmented Generation (RAG), fine-tuned LLMs, and Hugging Face cross-encoders to re-rank results, accelerating critical patient information retrieval for diagnosis.
    • Leveraged Apache Spark and Databricks to process large volumes of healthcare data from multiple sources, reducing ETL processing time by 40% and enhancing performance for real-time analytics.
    • Designed and automated ETL pipelines using Apache NiFi and Talend to integrate Electronic Health Records (EHR) and Electronic Medical Records (EMR), increasing data freshness and integrity by 25%.
    • Enforced HIPAA-compliant data governance strategies, including encryption, anonymization, and audit trails, to ensure secure handling and privacy of Protected Health Information (PHI) across the data lifecycle.
    • Consolidated healthcare datasets using AWS S3 for scalable storage and Amazon Redshift for high-performance analytical queries, achieving 99% data availability for downstream reporting and machine learning applications.
    • Developed predictive dashboards with Power BI and Plotly to provide clinicians with dynamic, risk-stratified insights into patient populations, treatment effectiveness, and operational trends.
    • Managed the end-to-end MLOps lifecycle using MLflow, automating model tracking, version control, and deployment, which led to a 15% decrease in model deployment cycle time.
    • Built and deployed RESTful APIs using FastAPI to integrate machine learning models into clinical applications, enabling real-time inference to support physicians with personalized care recommendations.
    • Ensured end-to-end regulatory compliance with FHIR, HL7, and ICD standards, maintaining interoperability and standardized communication across diverse healthcare data systems.
    • Applied advanced statistical modeling techniques such as Bayesian inference and multivariate analysis using SAS and SPSS to support clinical research and evaluate patient treatment outcomes.
    Deloitte
    Data Scientist
    Indianapolis, IN, United States, 46298
    June 2021 - July 2023
Skills
Languages
English, French, German, Spanish
Skills
A/B Testing, Airflow, Amazon Redshift, Amazon S3, Amazon Web Services, Anonymization, Apache Hadoop, Apache HBase, Apache Kafka, Apache NiFi, Apache Spark, Application Programming Interfaces (APIs), Artificial Intelligence, Auditing Skills, Big Data, Business Software, Business Strategies, Clinical Decision Support, Clinical Research, Clinical Works, Cloud Computing, Communication Skills, Computer Engineering, Cryptography, Customer Retention, Customer Satisfaction, Cycle Time Variation, Dashboards, Data Analysis, Databricks, Data Governance, Data Integration, Data Science, Data Storage Technologies, Data Visualization, Deep Learning, DevOps, Distributed Systems, Docker, Electrical Transformers, Electronic Medical Records, Execution of Experiments, Extract Transform Load (ETL), FastAPI, Fast Healthcare Interoperability Resources, Flask (Web Framework), Fraud Prevention and Detection, Git, GitHub, Health Care, Health Insurance Portability and Accountability Act Compliance, Health Level Seven International, Implantable Cardioverter-Defibrillator, Investment Decisions, Java (Programming Language), Jenkins, Knowledge of Finance, Knowledge of Statistics, Kubernetes, Large Language Models, Leadership, Machine Learning, Machine Learning Operations, Market Segmentation, Market Trends, Matplotlib, Medical Records, Metrics, Microsoft Azure, Money Investments, Monte Carlo Methods, Multivariate Analysis, Natural Language Processing, NLTK (NLP Analysis), Patient Information Leaflets, Plotly, Portfolio Management, Power BI, Predictive Data Analysis, Programming Languages, Protected Health Information, PySpark, Python (Programming Language), Query Performance, Regulatory Compliance, Reliability, RESTful APIs, Risk Analysis, R (Programming Language), SAS (Software), Scalability, Search Engines, Semantics, Simulations, Social Media, Software Version Control, spaCy, SPSS (Software), SQL Databases, Stock Control, Strategic Management, Strategies of Marketing, Success Driven Person, Tableau (Software), Talend, TensorFlow, Testing Skills, Time Series, Transaction Data, XGBoost
Hobbies
Pickleball
Cricket
Volleyball
Reading
Traveling
Camping
Beaches
Cooking
Exploring new cities
Github / Code
Jash-stack
0 followers, 0 stars, 11 repositories