• Built a company embedding space covering LinkedIn’s ~20 million organizations
• Preprocessed data in Scala with Apache Spark to build a directed, weighted graph with ~100 million edges
• Trained word2vec (gensim) and GraphSAGE GNN (TensorFlow) models to produce 50-dimensional company embeddings
• Evaluated embeddings both objectively (e.g., logistic regression) and subjectively (Appen)