Python

Python has long played a key role in my work. In earlier years, working at the edge of data science for a large institution, I used Python and its notebooks as a home base for exploring datasets, testing ideas, examining analytics for our applications, running machine learning experiments, and handling any other data-centric task. For general Python-based application development and workflows, see the bottom section of this page on “Core Python.”

In more recent years, my time with Python has naturally focused on PyTorch, in which I have designed, assembled, and trained a variety of models in the multimodal and generative space. See machine learning. I have also used Ray for parallelism and large-scale training, and have adopted many techniques for manipulating embedding spaces, kNN search, and other vector-space data objectives.
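As a small illustration of the kNN search mentioned above, here is a minimal sketch of brute-force nearest-neighbor lookup by cosine similarity. It is written in plain Python for readability and is not the code from any of the projects described; the names `cosine_similarity`, `knn_search`, and `toy_index`, along with the toy vectors, are made up for this example. At real scale the same idea would typically run on batched tensors or an approximate index.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def knn_search(
    query: Sequence[float],
    index: Dict[str, Sequence[float]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k entries of `index` most similar to `query`, best first."""
    scored = [(key, cosine_similarity(query, vec)) for key, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical 3-dimensional embeddings, purely for illustration.
toy_index = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}

neighbors = knn_search([1.0, 0.0, 0.0], toy_index, k=2)
for label, score in neighbors:
    print(f"{label}: {score:.3f}")
```

The brute-force scan is linear in the size of the index, which is fine for small collections; the design choice worth noting is that ranking by cosine similarity ignores vector magnitude, so only direction in the embedding space matters.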

Core Python: Data Manipulation & Databases in Production

example production project: credential mapping for reaccreditation, a critical data project for a large institution

I have written and deployed many microservices based on Flask (usually served by Waitress, then proxied by nginx) as internal APIs for our applications; many of these also exposed GraphQL on the endpoint. In other cases, Python has taken on the task of mapping very large relational databases (Oracle at the university) into object structures, and then sometimes into a graph form or dumped into something like MongoDB for further use, often on a nightly schedule.
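The relational-to-object mapping described above can be sketched roughly as follows. This is an illustrative toy, not the production code: the standard library's `sqlite3` stands in for Oracle so the example runs anywhere, and the `instructor` and `course` tables, the `Instructor` and `Course` classes, and the `TEACHES` edge label are all hypothetical.

```python
import sqlite3
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Course:
    code: str
    title: str


@dataclass
class Instructor:
    instructor_id: int
    name: str
    courses: List[Course] = field(default_factory=list)


def load_instructors(conn: sqlite3.Connection) -> Dict[int, Instructor]:
    """Fold flat joined rows into nested Instructor -> Course objects."""
    instructors: Dict[int, Instructor] = {}
    rows = conn.execute(
        "SELECT i.id, i.name, c.code, c.title "
        "FROM instructor i LEFT JOIN course c ON c.instructor_id = i.id "
        "ORDER BY i.id, c.code"
    )
    for instructor_id, name, code, title in rows:
        if instructor_id not in instructors:
            instructors[instructor_id] = Instructor(instructor_id, name)
        if code is not None:  # LEFT JOIN yields NULLs for instructors without courses
            instructors[instructor_id].courses.append(Course(code, title))
    return instructors


def to_edges(instructors: Dict[int, Instructor]) -> List[Tuple[str, str, str]]:
    """Flatten the object structure into (source, relation, target) graph edges."""
    return [
        (f"instructor:{inst.instructor_id}", "TEACHES", f"course:{course.code}")
        for inst in instructors.values()
        for course in inst.courses
    ]


# In-memory stand-in for the real database, with made-up rows.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE instructor (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT, instructor_id INTEGER);
    INSERT INTO instructor VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Edsger');
    INSERT INTO course VALUES
        ('CS101', 'Intro', 1), ('CS202', 'Compilers', 2), ('CS303', 'Graphs', 1);
    """
)

instructors = load_instructors(conn)
edges = to_edges(instructors)
# Plain nested dicts, the shape a document store would accept.
documents = [asdict(inst) for inst in instructors.values()]
```

The nested dictionaries in `documents` are the kind of structure that could be written to a document database, while `edges` is the same data in a form a graph store or graph library could ingest; a nightly job would simply rerun the load and replace the previous output.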

I have also contributed heavily to integrating Python-based data work and data science into workflows that rely on SAS, contributing to major open-source initiatives in this domain.