Steam Degrees of Separation

The following content does not necessarily represent the opinion of my employer. All posts on this website are solely my opinion.

When I was a kid, I would sometimes click through Steam profiles, moving through each user’s highest-level friends until I eventually reached the highest-level user on Steam. What surprised me was that, almost 100% of the time, no matter who I started with, I would end up at that highest-level user. This was before I knew how to program, but even then, I imagined making some kind of tool to measure the degrees of separation between any Steam user and the Steam elite. In May 2021, I wrote that tool, and yesterday (January 17 2023), I improved it significantly.

Steam Degrees of Separation is essentially depth-first search with some additional logic to handle the quirks of the Steam user system. Depth-first search, or DFS, takes a simple approach to finding a goal within a graph: go as deep as possible along a given path until you reach a dead end or a loop. At that point, backtrack, and move on to the next unexplored node. Continue the search from there. DFS only needs a few simple rules, and, if implemented properly, can find a path quite quickly. One downside to DFS is that you have no guarantee that the path you find will be the shortest one; for this application, that’s acceptable.

My version of DFS works by examining the 5 (or 6) highest-level friends of a given user on steam. Always choosing the highest-level whenever possible, try to find a path to the 10 highest-level users on steam. If you encounter a private profile, or one with no unexplored friends, backtrack. If you encounter a loop, backtrack. If you’ve explored every user in this network, you can make the assumption that it is impossible to reach the top 10 by visitng only the highest-level friends of the starting user.

Here’s an example that involves a lot of loops:

Enter the URL for the steam profile you would like to check. URL must start with http.
For example: https://steamcommunity.com/profiles/76561197993787733.
URL: https://steamcommunity.com/id/ah_
Fetching the top 10 highest-level Steam users…
Done!
Searching…
Exploring aaron’s profile…
Exploring bluey.’s profile…
Exploring Squishy’s profile…
Exploring GarrettG’s profile…
Exploring Squishy’s profile…
We’re in a loop! Exiting the loop and moving down the list…
Exploring bluey.’s profile…
We’re in a loop! Exiting the loop and moving down the list…
Exploring Lunar’s profile…
Exploring GarrettG’s profile…
We’re in a loop! Exiting the loop and moving down the list…
Exploring LCT’s profile…
Exploring Chrissy’s profile…
Exploring Traetz’s profile…
Exploring Orlando Lima’s profile…
Found St4ck! Here’s the full path to their profile from aaron:
aaron -> bluey. -> Squishy -> GarrettG -> Lunar -> LCT -> Chrissy -> Traetz -> Orlando Lima -> St4ck.
aaron is 9 users away from the top 10.

In the above run, the delightfully named Squishy, bluey, and GarrettG were all each other’s highest-level friends, resulting in attempts to return to already-visited users. The algorithm found this loop, backtracked, and moved on, eventually reaching a member of the top 10 after a long journey.

As I said, I originally wrote this program in May 2021. At the time, I was taking CS-310 Basic Algorithms at NYU, and realized that my idea a decade ago to visit the highest-level friend of each user was actually just DFS. By using the Steam API (to be specific, an unofficial Python wrapper for the Steam web API), I was able to get the friends list for any user, as long as that user’s profile was public. The only downside was that the API was extremely slow; it took minutes to get a single user’s friends list. I had to fetch a list of all of each user’s friends; given that some of these users had thousands of friends, it took a while. There might have been a way around this, but I was unable to find one. I knew that I could use web scraping to get friends lists much more quickly, but having toyed with Beautiful Soup before, I had come to hate the whole library. I still think Beautiful Soup is clunky, but ultimately, it’s the best tool for the job. Once you have some experience, it becomes fairly intuitive.

Like I said, I didn’t want to use Beautiful Soup, so I let the project sit for almost 2 years, until I felt my interest returning. It ended up only taking a couple hours to figure out, and the new version of the application, contained in steamSepScrape.py, runs significantly faster.

I’m a bit rusty when it comes to big O notation and runtime analysis, but I believe both versions of the program run in O(n) time, where n is the number of users in a given component of the larger graph. In other words, for any isolated island of users separated from other islands of users, the longest the algorithm will take to run is equivalent to the number of users in that island. A worst-case scenario would be a great many users, each with 6 friends, arrayed in a formation in which it is necessary to visit every single one before realizing that there is no way to reach the top 10. I have thus far been unable to find such a component on Steam, but it could theoretically exist. Far more likely is a much smaller component, with just a few users, all of whom are only friends with each other.

For a rudimentary proof of the algorithm’s runtime, consider the following simplistic, informal, and possibly incorrect proof by counterexample:

To prove that it is impossible for the algorithm to have a runtime slower than O(n), let’s assume for the sake of argument that there exists some component, C, for which the runtime of the algorithm is greater than O(|C|), where |C| is the number of users in C; that |C| > 0; and that the algorithm is trying to find a path between users a and b. If O(C) > O(|C|), then the path between a and b must contain some number of users, u, greater than |C|; however, a path of such size would inherently contain a number of users such that u > |C|. A path of length greater than |C| would require some user to be included twice, which is impossible, as the algorithm exits out of any and all loops, and does not investigate users it has already investigated. Therefore, this is a contradiction, and the runtime of the algorithm must be O(n).

I don’t have a formal proof of any of the above, nor will I ever write such a proof (and I really will stick to that plan). I might tinker with the algorithm a bit, but at this point, I think I can call it complete. It was an interesting, informative journey. If you want to try the program out for yourself, the code is available here.

Posted January 18, 2023