Projects & Blog
Data Science, MLOps, and Cloud Engineering¶
Taking Python to Production - Udemy Course¶
Highly-rated Udemy course covering the fundamentals of software engineering.
Many data scientists and junior engineers come from an academic background in statistics or another quantitative field, without having learned the software engineering skills they need to bring their ideas to production or collaborate effectively.
This course has helped hundreds of students become effective citizens of the software engineering community. These skills have elevated the self-sufficiency of my co-workers and laid a foundation for us to do advanced MLOps.
Soon-to-be-Official VS Code Extension for ClearML¶
Remote workstations are amazing! Especially for data science.
From your laptop, you can SSH into a more powerful machine to get tremendous gains in productivity.
Here is a LinkedIn post of mine listing several benefits and comparing different remote workstation offerings.
Metaflow by Outerbounds is an open-source MLOps tool with a closed-source VS Code extension that allows you to connect to remote workstations.
I organized a hackathon that created a clone of Outerbounds' VS Code extension for another tool called ClearML, allowing anyone to host their own on-prem or cloud DS workstations without the need for Kubernetes and connect in one click.
🎉 ClearML offered to officially adopt the extension and take over maintenance. They also complimented our clean TypeScript code.
Elevated Analytics - CRM Analytics Startup¶
What: We offer analytics via SaaS for Salesforce and other CRMs as well as mentorship and training on best practices for tracking an organization's sales process.
This is ideal for groups who have not hired a full-time analyst but want affordable, high-quality visibility into their sales funnel.
Who: I started this company in 2022 together with Ryan Gardner, whom I met through Rootski, and two friends Ryan had worked with, Nate Roberts and Colin Toyn.
These are great co-founders. I built out data flows and a warehouse to ingest CRM data from customers, Ryan built the BI layer on top of that, and Nate and Colin have been networking to find clients. Nate, Ryan, and Colin are full-time sales analysts, so they are the subject matter experts.
Self-hostable Minecraft Server Platform-as-a-Service¶
Over the course of December 2022, I rallied several strangers and friends around the cause of saving Christmas 🎄 for the Riddoch family cousins. Seven software engineers pulled several super late nights.
We produced a high-quality, self-hostable, serverless, secure, nearly free-to-run Minecraft server platform-as-a-service (PaaS) that anyone can deploy into their own AWS account using AWS CDK.
🎉 This project pushes the boundary of what you can do with AWS CDK. Some of the AWS CDK core developers were really impressed!
Run pip install awscdk-minecraft to get started!
Organized successful worldwide AI/data hackathon¶
I co-organize the Utah chapter of the MLOps.community Meetup. Through this Meetup, I organized the most successful hackathon we've ever had!
- 60 in-person attendees, 20 remote
- 2 sponsors (BENlabs and Nerd United)
- $2,500 in prizes
- Nine 90-second project demo videos submitted (see video)
Platform Engineering: Create/deploy a product in minutes¶
This is an open-source POC I did before implementing this at work.
What if, in under 5 minutes, you could create a new repository with permissions, CI/CD, build secrets, and a boilerplate app that deploys to the cloud in one click?
Seriously, this is the pinnacle of platform engineering!
www.rootski.io Deep learning SaaS for studying Russian¶
v1.0.0 - Containers and SQL
#Terraform #DockerSwarm #Postgres #Bamboo Server #Bitbucket Pipelines #Traefik
v2.0.0 - Serverless and NoSQL
#AWS CDK #AWS Lambda #DynamoDB #GitHub Actions #AWS API Gateway #AWS Cognito / OAuth 2.0 #Sphinx
Russian words are often looong, but there are only ~300 word roots which make up the most common words.
You can break up the word саморазмораживающийся (self-defrosting) like this:
само (self-) раз (un-) мораж (frost) ивающийся (-ing)
Rootski uses a deep learning transformer model to break Russian words into roots. If a breakdown is obviously wrong, users can submit their own. The accompanying GIF shows the breakdown for "выходить", which means "to exit".
I worked on Rootski for four years.
I mentored more than 20 people from all over the world in exchange for help building it out.
I ran Rootski like a startup. I:
- Recorded a 10-hour YouTube playlist for onboarding junior developers to the codebase and tools.
- Deployed a knowledge base generated with an advanced Sphinx setup.
- Gave total strangers access to my AWS account paid for with my credit card and trusted my IAM roles to keep costs from exploding.
- Automated the creation of ClickUp tickets for new contributors which guided them through onboarding in a structured, self-serve way.
- Recruited contributors through LinkedIn with posts like this one and made a lot of friends in the process.
In the end, I finally acknowledged that I had learned a lot from building Rootski, but there were other projects I wanted to work on more.
Rootski is a fantastic reference project for anyone who wants to learn to build a scalable, secure, stable SaaS product at low cost with modern tools.
Top Contributors
- Eric Riddoch 🧑‍🏫 💻
- Isaac Robbins 💻 🚇
- Josh Abrahamsen 🚇
- Ryan Gardner 💼
- Joe Drapeau 💻
- Ethan Walker 💻
- Isaac Z Tai 👀
- Adam Lenning ♿️
Rootski mobile app (abandoned)¶
My original plan for Rootski was to make it an offline-first, cross-platform mobile app. Two things caused me to rewrite it for the web:
- After surveying many potential users, I found that most would prefer to use it on their computers.
- While I was able to export the PyTorch model with ONNX, I would have had to write native Java/Kotlin and Swift code to run inference on iOS and Android.
It was sad to leave this project behind, but I learned React in the process, which has been super valuable.
MLOps & Observability: First to ever send BentoML logs, metrics, & traces to NewRelic¶
BentoML is a cutting-edge tool for high-performance model serving.
I did extensive testing and struggled to send BentoML metrics and traces to NewRelic.
I ultimately succeeded in creating an experiment (shown in screenshot; link to code below). I created a FastAPI app instrumented with NewRelic that hit a BentoML API instrumented with OpenTelemetry, which then hit another FastAPI app, again with NR instrumentation. I used the AWS OTel Collector to send the BentoML traces to NewRelic.
🎉 It worked! My example shows that NewRelic and OpenTelemetry traces are fully compatible and that BentoML can be monitored with no code changes!
🎉 I also submitted an issue which BentoML implemented, making it so failed HTTP requests return a trace ID in the headers. This makes it possible to look up the logs of failed requests to find the root cause!
GitHub
Book about Linux and the command line¶
A lot of my friends in data science feel insecure about their software development skills because they didn't study "Software Engineering" or "Computer Science" in school.
I have mentored and pair-programmed with many people to help them learn Linux, git, OOP, test-driven(ish) design, and other key software skills. After doing this, I've decided to put it all in one place by writing a book that assumes nothing but knowledge of Python syntax.
Update: I've stopped writing the book, and am making a Udemy course instead. The chapters that are here turned out nicely!
Eric the Vast - Fitness blog with data!¶
Behold the latest in fitness tracking technology™️
I read a book called Atomic Habits which helped me realize that the reason I often fail to reach my long-term fitness goals is that skipping workouts is not painful in the short term.
So, I wrote up a fitness contract stating that if I ever miss a workout, I must pay my family and friends $750!
I blog my progress monthly and embed interactive dashboards populated by my workout log in Google Sheets, so my friends have total visibility into everything I eat and all the workouts I do.
Docker for data scientists - meetup talk¶
Deploying code to production can be as easy as running it on your own machine.
In this presentation, I give a very visual introduction to using Docker to wrap apps into tidy containers so that they can run seamlessly in any environment.
Thompson sampling with cats!¶
A fun demonstration of how Thompson Sampling can be used to achieve better results than A/B Testing when deciding which version of a product to deploy.
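To make the idea concrete, here is a minimal sketch of Beta-Bernoulli Thompson Sampling in plain Python. It is not the code from the demo: the function name `thompson_sampling`, the conversion rates, and the round count are all made up for illustration. Each variant keeps a Beta posterior over its conversion rate; on every round we sample from each posterior and show the variant with the highest sample, so traffic drifts toward the better variant while the experiment is still running.

```python
import random


def thompson_sampling(true_rates, n_rounds, seed=0):
    """Simulate Beta-Bernoulli Thompson Sampling over product variants.

    true_rates are hypothetical conversion rates used only to simulate users.
    Returns how many times each variant was shown.
    """
    rng = random.Random(seed)
    # Beta(1, 1) prior for every variant (uniform over conversion rates).
    successes = [1] * len(true_rates)
    failures = [1] * len(true_rates)
    pulls = [0] * len(true_rates)

    for _ in range(n_rounds):
        # Draw one plausible conversion rate per variant from its posterior.
        samples = [rng.betavariate(s, f) for s, f in zip(successes, failures)]
        # Show the variant whose sampled rate is highest.
        arm = samples.index(max(samples))
        pulls[arm] += 1
        # Simulate whether the user converted, then update that posterior.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return pulls


# Three made-up variants; the third converts best.
pulls = thompson_sampling(true_rates=[0.05, 0.10, 0.30], n_rounds=2000)
print(pulls)
```

A fixed 50/50 A/B test would keep sending half of the traffic to the weaker variant until the test ends; here the weaker variants are shown less and less as evidence accumulates.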
GitHub See the Cats!
Spotify analysis with linear regression¶
An analysis of several thousand songs on Spotify. It examines which features of a song are most correlated with popularity.
I check the LINE assumptions, run LASSO regression to eliminate noisy features, and then do a brute force model selection to find the features most predictive of popularity.
Analysis
Dominating a writing assignment with Python¶
For a writing class group assignment, we were pretending to be consultants for Pluralsight's social media strategy. Our paper was the usual fluff, except for THIS.
I scraped the company Facebook pages of a few different e-learning platforms and showed that Udemy (yellow line) had way more engaged followers than Pluralsight (blue line). We blew our professor's mind and we all got A's 🤣.
Cringy Paper
Advanced Topics in Applied Mathematics¶
Facial recognition¶
Algorithm that uses eigenfaces to solve the computer vision problem of face recognition.
Given an image of a face, this algorithm finds the closest match in the data set with respect to the Euclidean norm.
Fourier transform¶
Implementation, explanation, and several applied examples of the famous Fast Fourier Transform.
This algorithm decomposes a signal via a linear transformation from the time domain to the frequency domain in O(n log n) time.
I have a rigorous understanding of the mathematics behind this process.
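For readers who want to see where the O(n log n) comes from, here is a short sketch of the recursive radix-2 Cooley-Tukey variant of the FFT. It is an illustration rather than the implementation from the project, and it assumes the input length is a power of two. The signal is split into even- and odd-indexed halves, each half is transformed recursively, and the two results are combined with "twiddle factors".

```python
import cmath
import math


def fft(signal):
    """Recursive radix-2 Cooley-Tukey FFT.

    Assumes len(signal) is a power of two; raises ValueError otherwise.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    if n == 1:
        return [complex(signal[0])]

    # Transform the even- and odd-indexed samples separately.
    even = fft(signal[0::2])
    odd = fft(signal[1::2])

    result = [0j] * n
    for k in range(n // 2):
        # Twiddle factor rotates the odd half before it is combined.
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        result[k] = even[k] + twiddle
        result[k + n // 2] = even[k] - twiddle
    return result


# One full cosine cycle across 8 samples: energy lands in bins 1 and 7.
samples = [math.cos(2 * math.pi * t / 8) for t in range(8)]
spectrum = fft(samples)
print([round(abs(x), 6) for x in spectrum])
```

Each level of recursion does O(n) work to combine the halves, and there are log2(n) levels, which gives the O(n log n) running time.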
Classical machine learning problem¶
Implementation of a K-Dimensional Tree classifier used to correctly identify the digits 0-9 from thousands of images.
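As a rough sketch of how a k-d tree can act as a classifier, the snippet below builds a tree over labeled points and labels a query with the class of its nearest neighbor. The function names and the tiny 2-D dataset are invented for illustration; for digit images, each vector would be the flattened pixel values of one image instead.

```python
def build_kdtree(points, depth=0):
    """Build a k-d tree from a list of (vector, label) pairs."""
    if not points:
        return None
    # Cycle through the dimensions as the tree gets deeper.
    axis = depth % len(points[0][0])
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def nearest(node, query, best=None):
    """Return (squared_distance, label) of the stored point closest to query."""
    if node is None:
        return best
    vector, label = node["point"]
    dist = sum((a - b) ** 2 for a, b in zip(vector, query))
    if best is None or dist < best[0]:
        best = (dist, label)

    # Descend first into the side of the split that contains the query.
    diff = query[node["axis"]] - vector[node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, query, best)
    # Search the other side only if the splitting plane is closer than the best match.
    if diff ** 2 < best[0]:
        best = nearest(far, query, best)
    return best


# Made-up 2-D training points standing in for image feature vectors.
training = [
    ((0.0, 0.0), 0), ((0.1, 0.2), 0),
    ((5.0, 5.0), 1), ((5.2, 4.9), 1),
    ((9.0, 0.5), 2),
]
tree = build_kdtree(training)
predictions = [nearest(tree, q)[1] for q in [(0.2, 0.1), (4.8, 5.1), (8.5, 1.0)]]
print(predictions)
```

The pruning step is what makes the tree worthwhile: whole branches are skipped whenever they cannot contain a closer point than the best one found so far.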
Page rank algorithm¶
The PageRank algorithm was central to Google's success. Before PageRank, search results on the web were totally disorganized.
I implemented the algorithm myself and used it to rank websites from a dataset maintained by Stanford University. Then, I imported the optimized NetworkX version to build a March Madness bracket.
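The core of PageRank fits in a few lines. The sketch below uses power iteration with the commonly used damping factor of 0.85; it is not the project's code, and the function name and four-page toy web are made up. Each page repeatedly passes its rank to the pages it links to until the scores stop changing.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Compute PageRank by power iteration.

    links maps each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(max_iter):
        # Rank held by pages with no outgoing links is spread evenly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in pages}
        # Every page splits its rank equally among the pages it links to.
        for page, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)

        converged = sum(abs(new_rank[p] - rank[p]) for p in pages) < tol
        rank = new_rank
        if converged:
            break
    return rank


# Hypothetical toy web: "c" is linked to by every other page.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
print(sorted(ranks, key=ranks.get, reverse=True))
```

For a bracket, the same idea applies with teams as pages and a "link" from each loser to the team that beat it, which is one plausible way to set up the graph.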
Markov chains¶
Yoda Speak
A Jedi's strength flows from this kind?
To question, no try.
A Jedi Knight with you be.
The Chosen One the Chancellor, he will be.
Encircle them we must, then divide.
I created a Markov chain from Master Yoda's speech to simulate a classical Natural Language Processing problem.
Because the next state of a Markov chain depends only on the current state (a single word in this case), the results are nonsensical and fun to read.
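Here is a small sketch of a word-level Markov chain text generator. It is an illustration under my own assumptions, not the original project code: the `<START>` and `<END>` markers and the three placeholder sentences are invented, and a real run would use a corpus of quotes instead. Each word is chosen by looking only at the word before it.

```python
import random
from collections import defaultdict


def build_chain(sentences):
    """Map each word to the list of words observed directly after it."""
    chain = defaultdict(list)
    for sentence in sentences:
        words = ["<START>"] + sentence.split() + ["<END>"]
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain


def generate(chain, rng, max_words=30):
    """Walk the chain from <START>, picking each next word at random."""
    word, output = "<START>", []
    while len(output) < max_words:
        # The choice depends only on the current word, not on earlier ones.
        word = rng.choice(chain[word])
        if word == "<END>":
            break
        output.append(word)
    return " ".join(output)


# Placeholder corpus; swap in real quotes to get themed output.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat chased the dog",
]
chain = build_chain(corpus)
sentence = generate(chain, random.Random(42))
print(sentence)
```

Storing repeated followers in a list, rather than counting them, means `rng.choice` automatically picks common transitions more often.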
Software Development¶
Family map on Android¶
Custom DOMO tiles - data pipeline tool¶
Your company may amass information from B2B services such as Facebook Advertising, LinkedIn, Salesforce, etc. DOMO provides an all-in-one data warehousing and analytics platform to handle this data.
This script can download one or more datasets from a DOMO instance, run the data through any R or Python script, and then push it back into DOMO. When scheduled, this serves as an easy data pipeline.
Algorithms and Data Structures¶
Pacman breadth-first search - Hackerrank¶
100+ Hackerrank challenges¶
Kevin Bacon BFS¶
In [1]: graph = bfs.MovieGraph()
In [2]: graph.path_to_actor("Robert Downey Jr.", "Kevin Bacon")
Out[2]:
['Robert Downey Jr.',
'Avengers: Infinity War (2018)',
'Josh Brolin',
'Hollow Man (2000)',
'Kevin Bacon']
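The search behind a result like this can be sketched as a breadth-first search over a graph where actors and movies are both nodes. The helper names `build_graph` and `shortest_path` are hypothetical (they are not the `bfs.MovieGraph` API shown above), and the cast data is limited to the two movies that appear in the output.

```python
from collections import deque


def build_graph(casts):
    """Build an undirected actor-movie graph from {movie: [actors]}."""
    graph = {}
    for movie, actors in casts.items():
        for actor in actors:
            graph.setdefault(movie, set()).add(actor)
            graph.setdefault(actor, set()).add(movie)
    return graph


def shortest_path(graph, start, goal):
    """Return the shortest alternating actor/movie path, or None."""
    if start not in graph or goal not in graph:
        return None
    parents = {start: None}  # Doubles as the visited set.
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the parent links back to the start to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph[node]:
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


casts = {
    "Avengers: Infinity War (2018)": ["Robert Downey Jr.", "Josh Brolin"],
    "Hollow Man (2000)": ["Josh Brolin", "Kevin Bacon"],
}
path = shortest_path(build_graph(casts), "Robert Downey Jr.", "Kevin Bacon")
print(path)
```

Because BFS explores nodes in order of distance from the start, the first time it reaches the goal it has found a path with the fewest possible hops.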
Binary search / AVL Tree¶
Web Development¶
This very website¶
This website has come a long way over the years. If you'd like to learn how to make your own, reach out to me! If you're willing to develop yourself and learn what you need, I'm willing to show you my tooling.
I want to thank my professional hero, Andrew Carr, for inspiring me to do projects like this and show them to the world. I've been riding his wave of pure-genius career ideas for years and it's high time I put his name on this website. He's blessed my life so much and deserves every bit of his massive success.
Dropbot - Chrome automator¶
Writing bots and web scrapers is useful, but can be a pain. Tools such as the Selenium library rely on developers to cleverly locate the important HTML elements of a page.
The Dropbot Chrome extension makes this easy, and even lets you export custom bot scripts as JSON objects for use in any programming language.
Vision therapy Tetris¶
I have a lazy eye. I took a 3-day challenge to learn JavaScript and made this vision therapy game for kids with the same condition.
Wearing an eye patch is a good way to exercise a non-dominant eye, but it doesn't train our two eyes to work together. With red and blue 3D glasses, only one eye can see the red blocks while the other sees the blue blocks--so the eyes must coordinate. This is a fun way for kids to do their vision therapy exercises, and train their eyes to work together.