- Fine-tuned an OpenAI LLM on patient data and built a conversational chatbot using the LangChain framework and Streamlit. Created RAG pipelines, retrievers, agents, and chains for automation using Python in Jupyter Notebook
- Used the Flowise AI GUI to load, split, and embed documents into vector databases (Pinecone, Chroma, FAISS) and deployed workflows for LLM applications, focusing on LangChain
- Performed Exploratory Data Analysis (EDA) in JupyterLab on Electronic Health Records (EHR) to identify data patterns, clusters, and correlations between features, and visualized the results as reports and charts in Power BI
- Implemented feature selection (Lasso and Ridge regressors, PCA) and predictive models (Random Forest, XGBoost, ANN, ARIMA, SARIMA) using Python libraries (NumPy, Pandas, scikit-learn, Matplotlib, Seaborn, Statsmodels, OpenAI, etc.) to predict tumor marker scores from 6k+ audio features, and applied prompt engineering with an OpenAI LLM for feature extraction
- Collected EHR feature data from various applications using Storyline APIs and stored it as RDB in AWS S3
- Constructed the data architecture, transformed and modelled data from S3 using AWS Glue, and stored it in Redshift
- Automated this data pipeline using AWS Lambda triggers and CloudWatch