What is Gumshoe?

Gumshoe and its alter ego Hawkshaw were two different products that did essentially the same thing, but from opposite angles.

Gumshoe was focused on employee retention and Hawkshaw was focused on recruiting. The system I built was able to non-intrusively watch employees’ behaviours online and tell you within 24 hours if one of them was exhibiting signs that they were looking for a new job. I was working on different ways to sell this knowledge, and it seemed to make sense to try to sell to both parties, as employee retention can be very expensive and head hunters also charge a pretty penny.

Retention clients would receive a monthly PDF report by email with a detailed analysis of each employee who had shown signals, what their risk levels were, and an explanation. If anything urgent came up, a notice would be emailed immediately as well. We would keep track of when new employees joined and track everything on our end for complete privacy and deniability by our clients.

For recruiting clients, I would send a daily email listing 5-30 people who had surfaced as making waves and who would be very amenable to a happenstance phone call or email asking if they were looking for something new. The only problem is that, much like fishing, you don’t know what’s going to be on the end of the line until you reel it in.

Why Gumshoe?

The genesis of this idea was me interviewing people who were running businesses and asking them what their biggest challenges were. One person told me that employee retention and hiring are hard. If he could only spot people before they left, he could get ahead of them and find ways to make them stay. I didn’t know if it could technically be done, but it was a good challenge and I went full force into it. That being said, the HR industry is tough: the amount of money they expect to pay per customer is pretty low, while both the cost to run such a service and its actual value returned were much, much higher.

I spent nearly a year of my time and all of my consulting savings, upwards of $50k, working on this venture. Much of that was on labour, but a good chunk was on expensive proxy services.

Technology Used?

This entire system was built on the following technologies:

  • ASP.NET MVC
  • MS SQL
  • Knockout JS
  • Hangfire
  • AngleSharp
  • Azure
  • Proxy Services
  • Web Scrapers
  • Various APIs (GitHub & Twitter)
  • Custom Chrome Plugins

This system was very processing-intensive and time-consuming to run, not to mention that the scraping required a lot of proxy magic. I relied heavily on Azure WebJobs to scrape and process each piece of information.

Achievements and Learnings

Gumshoe was a very complex and challenging project. I learned a ton about scraping, proxying, beating the system, scaling, and more. I learned Knockout JS, which was a nice level up. I also pushed the limits of LINQ and hit super weird errors that probably almost no one has ever seen, like “Contact your system administrator” when you try to push it too hard.

I really upped my C# and generics game, spent a lot of time organizing, refactoring, and documenting code, and worked on getting Azure Service Bus, Azure WebJobs, and Azure SQL to play nicely together. I hit a lot of transient database issues because I was pushing too much data too fast for the already expensive tiers of Azure SQL I was running.
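Transient database errors like these are commonly handled by retrying the failed operation with exponential backoff and jitter. The following is a minimal Python sketch of that general pattern, not the code Gumshoe actually ran (the project was written in C#). `TransientError`, `with_retries`, and all of the delay values are hypothetical names and numbers chosen for illustration.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a throttling or timeout error from the database."""


def with_retries(operation, max_attempts=5, base_delay=0.5, max_delay=30.0,
                 sleep=time.sleep):
    """Call operation(), retrying on TransientError with exponential backoff.

    The delay ceiling doubles after each failed attempt (capped at max_delay),
    and the actual wait is a random value below that ceiling ("full jitter")
    so that many workers do not all retry at the same instant.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller see the failure
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, ceiling))
```

A batch insert would then be wrapped as `with_retries(lambda: save_batch(rows))`, where `save_batch` is whatever function writes to the database. Only errors known to be transient should be retried this way; anything else should fail immediately.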

I spent many months fighting and defeating LinkedIn and their anti-scraping measures, making fixes whenever anything was changed, and learning that scraping code was quite ephemeral and risky at best and turbulent and frustrating at worst.

I built an internal timeline analysis dashboard for reviewing any people that popped up on my radar, and spent a lot of time on the UX so I could quickly tell whether a hit was a false positive and mark it as such; it was optimized for my vertical monitor. I had to figure out a lot of time-based scoring algorithms to make the detection portion of this work. I also had to remake the LinkedIn profile database from the outside looking in.
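To give a feel for what a time-based score can look like, here is a minimal Python sketch in which each signal contributes a weight that halves every fixed number of days, so recent activity counts for more than old activity. This is an illustration under stated assumptions, not Gumshoe's actual algorithm: the signal kinds, the weights, the 14-day half-life, and the names `Signal` and `risk_score` are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical signal kinds and weights, invented for this example.
WEIGHTS = {"headline_change": 2.0, "profile_edit": 1.0, "new_skill": 0.5}
HALF_LIFE_DAYS = 14.0  # assumed: a signal loses half its weight every 14 days


@dataclass
class Signal:
    person: str
    kind: str
    seen_at: datetime


def risk_score(signals, now):
    """Sum the signal weights, each decayed exponentially by its age in days."""
    score = 0.0
    for signal in signals:
        age_days = (now - signal.seen_at).total_seconds() / 86400
        if age_days < 0:
            continue  # ignore signals timestamped in the future
        decay = 0.5 ** (age_days / HALF_LIFE_DAYS)
        score += WEIGHTS.get(signal.kind, 0.0) * decay  # unknown kinds add 0
    return score
```

With a scheme like this, a burst of several changes in one week scores far higher than the same changes spread across a year, which is the kind of pattern a daily review queue can be sorted by.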

During this project I also learned how to build a custom Chrome extension, which let me quickly import people into my watch list from every company I could find.

I learned about interesting things LinkedIn does for you without telling you, such as correcting minor grammar and style errors in your text and turning double spaces after periods into single spaces.

During the process of building, testing, and running Gumshoe, I manually reviewed over 8000 daily signals that my algorithm bubbled up out of the 1,793,484 signals that were processed. By the time I stopped, my database was 162 GB.

For many of the strong signals I detected, I made predictions and followed up later to confirm that the people had indeed moved on. In several cases they were people I knew, so I checked in with them before they moved on and found out that they were in fact looking. The system definitely worked great; you just had to cast a wide net and watch every day.

I uncovered several companies that had gone bankrupt months before they announced it, as well as one acquisition. I could also detect people who got fired and tried to hide it.

When Was This Made?

Gumshoe was worked on between mid-2016 and mid-2017.