Machine Learning
Recent professional work: Slite’s Ask
Earlier projects: browser-based NLP for Overlay AI; various survey projects
Over the past several years, my work has turned increasingly towards machine learning, ranging from lighter tasks (clustering) to more in-depth NLP.
My earlier machine learning and NLP work, as an analyst for a large university, centered on building, fine-tuning, clustering, and then serving robust vectors for topical analysis, using simpler foundational NLP techniques (this was one of my NLP projects that predates transformers); see this summary. Following up on that work, I was occasionally tasked with taking large survey or other text data (e.g., a database of syllabi) and analyzing it for topics and themes of interest to the university administration.
Working in a consulting role for the startup Overlay AI meant diving deep into the state of the art for NLP in a number of domains, and working to efficiently build an entire pipeline in a resource-constrained context: going from tokenization through tagging, stemming, lemmatization, and dependency parsing, and then applying top-level rules or vector-space comparisons, all within a chain of WASM-compiled tools running in a background WebWorker.
I have also put considerable effort into working with semantic graphs, particularly WordNet. To find topical connections across words and potentially disambiguate meanings, I wrote code (in Rust) that traverses WordNet’s relations in order to pick the closest neighboring senses. I later moved those efforts towards reducing the graph (using graph-to-vector learning) into a vector space, gaining useful spatial representations of all WordNet senses, which I was then able to put to use in obtaining smart synonyms or word replacements.
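A minimal sketch of that kind of traversal follows, written in Python rather than Rust for brevity and run over a tiny hand-made graph rather than real WordNet data. The sense names, the `GRAPH` dictionary, and the `distances_from` and `closest_sense` helpers are all illustrative assumptions, not the original code: the idea is simply to run a breadth-first search from each candidate sense and keep the one that sits nearest to any sense found in the surrounding context.

```python
from collections import deque
from math import inf

# Toy, hand-made sense graph (hypothetical; NOT real WordNet data).
# Keys are sense identifiers, values are directly related senses.
GRAPH = {
    "bank.n.01": ["slope.n.01"],                      # river bank
    "bank.n.02": ["financial_institution.n.01"],      # money bank
    "slope.n.01": ["bank.n.01", "river.n.01"],
    "river.n.01": ["slope.n.01", "water.n.01"],
    "water.n.01": ["river.n.01"],
    "financial_institution.n.01": ["bank.n.02", "money.n.01"],
    "money.n.01": ["financial_institution.n.01", "deposit.n.01"],
    "deposit.n.01": ["money.n.01"],
}


def distances_from(start, graph):
    """Breadth-first search: hop count from `start` to every reachable sense."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def closest_sense(candidates, context_senses, graph):
    """Pick the candidate sense with the shortest path to any context sense.

    Returns None when no candidate can reach any context sense.
    """
    best, best_distance = None, inf
    for candidate in candidates:
        dist = distances_from(candidate, graph)
        nearest = min(
            (dist[s] for s in context_senses if s in dist), default=inf
        )
        if nearest < best_distance:
            best, best_distance = candidate, nearest
    return best
```

Here, calling `closest_sense(["bank.n.01", "bank.n.02"], ["water.n.01"], GRAPH)` selects the river-bank sense, because only that candidate can reach the context sense in this toy graph. A real traversal would likely weight different relation types (hypernyms, meronyms, and so on) differently rather than counting every hop equally.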
Over time, I adopted the increasingly dominant approach of extending pre-trained, transformer-based models (often BERT-based models, since they yield excellent low-level vectors) and further tuning these for domain-specific tasks. Given the size of these models and my focus on web-embeddable learning, I am particularly invested in research on model distillation, through teacher-student or other methods for minification.
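For readers unfamiliar with teacher-student distillation, the core idea can be sketched in a few lines of plain Python. This is the generic soft-target formulation (a temperature-softened KL term blended with the usual cross-entropy on the true label), not a description of any specific training setup mentioned here; the function names and the `temperature` and `alpha` defaults are illustrative assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperatures give softer distributions."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Blend a soft-target loss (match the teacher) with a hard-label loss.

    alpha weights the teacher-matching term; (1 - alpha) weights the
    ordinary cross-entropy against the true class index `label`.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)

    # KL(teacher || student) over the softened distributions.
    kl = sum(
        t * math.log(t / s)
        for t, s in zip(p_teacher, p_student)
        if t > 0
    )

    # Standard cross-entropy of the student (temperature 1) on the true label.
    hard = -math.log(softmax(student_logits)[label])

    # The T^2 factor keeps the soft-term gradients on a comparable scale.
    return alpha * temperature ** 2 * kl + (1 - alpha) * hard
```

A student whose logits agree with the teacher (for example, both `[4.0, 1.0, 0.0]` with `label=0`) receives a lower loss than one that disagrees (`[0.0, 1.0, 4.0]`), which is the signal that pushes a smaller model towards the larger model's behavior during training.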
At Slite, I used my expertise to direct the architecture of a complete, AI-powered product feature called Ask, which uses a mixture of models and techniques in order to power a Q&A feature from a workspace of documents.