
I changed jobs close to seven years ago with a personal objective to learn Python. Having battled “data challenges” during every facet of my career, I couldn’t see myself keeping up without the ability to automate collection and analysis of data. My new role was primarily security and compliance related, but a subset of my responsibilities was curating AWS Cost and Usage Report data.
I first encountered the “CUR” early on at the job I was leaving. I was one of three hired to help develop and launch a security product and was centrally involved in the cloud architecture and operations. A beta version of the product was online within 6 months and requests for visibility into the costs immediately followed. This was in 2013.
I battled with the detailed billing CSVs, but it wasn’t long before we purchased Cloudability. While it saved me from the data wrangling and reporting challenges, it also highlighted the problems in the data. Tag gaps and making sense of reserved instance amortization were initially the biggest of these, but J.R. Storment and team were engaged and developing features for our needs. Things seemed alright.
It wasn’t immediate, but it didn’t take long before the cost of understanding our costs was too costly for the business. At the same time, I was looking to shift away from the 24/7 life of product operations and ended up shifting into business operations where I focused on AWS account management and cost data reporting and analysis. Splunk was the tool of choice, as a co-worker had already done some magic for our product with it. I had published a Splunk app for analyzing Security Onion data years prior, but hadn’t worked with it since. It wasn’t long before we were reporting in-house and doing magical things with Splunk. Then the data grew…and grew…and grew.
Between consolidating AWS accounts and new cloud initiatives, it wasn’t going to be long before Splunk went the way of Cloudability. When the writing was on the wall, I didn’t want to stick around to see that happen and wondered if there were better/other ways of tackling the problem. I had also failed a personal goal to learn Python three years running, and I was determined to change that.
My current role offered me the immediate chance to find out. Before it was a button click (or three) in the UI, I implemented Athena for querying CUR data and built a pandas-based Python data pipeline that processed rules to fill tagging gaps, performed context lookups, and allocated reserved instance and savings plan costs. I later leveraged QuickSight for reporting and for giving other teams access.
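For a rough sense of what rule-based gap filling can look like, here is a minimal pandas sketch. It is an illustration only, not the actual pipeline: the rules, the `team` tag key (and therefore the `resource_tags_user_team` column), the account IDs, and the sample rows are all made up for the example. The other column names follow the Athena naming of the CUR.

```python
import pandas as pd

# Hypothetical rules: (column to inspect, regex to match, team to assign).
# Rules are applied in order, so the first matching rule wins.
RULES = [
    ("line_item_resource_id", r"^arn:aws:s3:::logs-", "platform"),
    ("line_item_usage_account_id", r"^111111111111$", "security"),
]


def fill_tag_gaps(df, tag_col="resource_tags_user_team", rules=RULES):
    """Return a copy of df with missing or empty tags filled from the rules."""
    out = df.copy()
    for column, pattern, value in rules:
        # A tag counts as a gap if it is null or an empty string.
        missing = out[tag_col].isna() | (out[tag_col] == "")
        matches = out[column].str.contains(pattern, regex=True, na=False)
        # Only rows that are still untagged are touched.
        out.loc[missing & matches, tag_col] = value
    return out


# Made-up sample rows standing in for CUR line items.
cur = pd.DataFrame(
    {
        "line_item_resource_id": ["arn:aws:s3:::logs-prod", "i-0abc", "i-0def"],
        "line_item_usage_account_id": ["222222222222", "111111111111", "222222222222"],
        "line_item_unblended_cost": [1.25, 3.50, 0.75],
        "resource_tags_user_team": [None, "", "data"],
    }
)

filled = fill_tag_gaps(cur)

# With the gaps filled, cost can be summed per team.
cost_by_team = filled.groupby("resource_tags_user_team")["line_item_unblended_cost"].sum()
```

Existing tags are never overwritten in this sketch; the rules only touch rows where the tag is null or empty, which keeps the original tagging as the source of truth.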
Having grown competent with Python and beaten the CUR with my bare hands, I’ve spent most of my personal time the last four years prototyping ways to collect, classify and analyze CUR data. Several approaches weren’t evolved enough to handle the task with performance and efficiency. Then came Python 3.11 and its impressive performance improvements. Then came polars, and suddenly the effort seemed viable.
BOYD is Cloud Archaeologist’s first release. I build stand-alone applications because I have no intent of adding to the inefficiency and waste of *aaS. When the answer to cloud questions is more cloud, we’re just feeding the beast. All the while, we’re barely using the powerful devices under our fingertips in favor of more cloud, more clickOps through UI accounts and regions, and more free promotional opportunities to consume more. If you are in a small organization and don’t have teams of people needing constant access available, why would you need a third party to connect to your account, copy and process your data, and go into idle waiting to serve?
If your goal is to understand AWS Cost and Usage Report data or your assets in AWS and gain immediate context about the resources buried (for far longer than you might expect) beneath millions of rows of data, BOYD might be a useful tool for you. BOYD allows users to classify resources (i.e., fill tagging gaps) and derive context about cost-incurring resources from AWS service APIs. While it can fill the need of a cost reporting tool, its value extends into the security and asset management domains.
Most of my personal time has gone into this project the last three years and BOYD has reached a point where it can be of benefit to others. I hope you find it useful and less tedious than many of the options available. And don’t let the low cost fool you. BOYD is certainly less costly than just about any other option available, and for a reason. You are paying for my development time and support if you experience issues. Not for a team of platform engineers, SaaS integrators, and all the complexity “cloud” delivers. My objective is to reduce that complexity. BOYD is a simple shovel, but sometimes a shovel is exactly what you need.