- Collaborated closely with the linguistics team to coordinate coding assignments
- Tested trained model's capability to assign unused tokens into English syntactic categories
- Conducted analysis of entropy data trends associated with word masking using BERT framework
- Created a proprietary model to categorically distinguish between animate and non-animate nouns
- Formatted 90,000+ lines of data extracted from research journals to compile a dataset for training the model