BOYD - How does BOYD work? - Cloud Archaeologist

BOYD uses local AWS command line profiles to access Cost and Usage Report (CUR) Parquet files in S3. At minimum, BOYD needs three settings: the CUR bucket, the region where that bucket is deployed, and an AWS command line profile that has access to the bucket.
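As a rough illustration of what those three settings make possible, the sketch below lists the Parquet objects in a CUR bucket with boto3. It is not BOYD's own code: `CurSettings`, `make_s3_client`, `list_parquet_keys`, and the example bucket, prefix and profile names are all hypothetical, and boto3 is assumed to be installed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CurSettings:
    """Hypothetical container for the three minimal settings."""
    bucket: str   # S3 bucket holding the CUR files
    region: str   # region where the bucket is deployed
    profile: str  # local AWS command line profile with access to the bucket


def make_s3_client(settings):
    """Build an S3 client from a local profile (requires boto3)."""
    import boto3  # third-party; imported lazily so the rest runs without it

    session = boto3.Session(
        profile_name=settings.profile,
        region_name=settings.region,
    )
    return session.client("s3")


def list_parquet_keys(s3_client, bucket, prefix=""):
    """Return the keys of all .parquet objects under the given prefix."""
    paginator = s3_client.get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages with no matching objects have no "Contents" entry.
        for obj in page.get("Contents", []):
            if obj["Key"].endswith(".parquet"):
                keys.append(obj["Key"])
    return keys
```

With real credentials, usage would look like `list_parquet_keys(make_s3_client(settings), settings.bucket, "cur/")`, where `"cur/"` stands in for whatever prefix your reports are delivered under.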

Once configured, BOYD will identify the CUR report version (1.0, 2.0 or FOCUS) and the columns available in your dataset. There may be hundreds of columns, and you will likely need only a fraction of them depending on your use case. Select the schema columns of interest. If you are not sure about a column’s contents, the S3 CUR Explorer lets you preview its values by browsing summary statistics computed directly from your reports in S3.
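To make the two ideas above concrete, here is a small sketch of what column-based detection and a column preview might look like. It is an assumption-laden illustration, not how BOYD or the S3 CUR Explorer is implemented: `guess_report_family` and `summarize_column` are hypothetical helpers, and the detection only separates FOCUS from CUR using the usage-date column names. Telling CUR 1.0 from 2.0 would need further checks that are not shown here.

```python
from collections import Counter


def guess_report_family(columns):
    """Guess FOCUS vs. CUR from the usage-date column name (heuristic only)."""
    names = set(columns)
    if "ChargePeriodStart" in names:
        return "FOCUS"
    if "line_item_usage_start_date" in names:
        return "CUR"
    return "unknown"


def summarize_column(values, top_n=3):
    """Compute simple preview statistics for one column's values."""
    non_null = [v for v in values if v is not None and v != ""]
    counts = Counter(non_null)
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(counts),
        "top_values": counts.most_common(top_n),
    }


# Made-up sample values for a product code column.
sample = ["AmazonEC2", "AmazonS3", "AmazonEC2", None, "AmazonRDS", "AmazonEC2"]
summary = summarize_column(sample)
print(guess_report_family(["line_item_usage_start_date", "line_item_product_code"]))
print(summary["rows"], summary["nulls"], summary["distinct"], summary["top_values"][0])
```

A preview like this is usually enough to decide whether a column carries useful values or is mostly empty for your accounts.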

CUR reports are typically massive in both width and length: lots of columns and lots of rows. Schema selection reduces the width, while the date columns you include (_end_date/_start_date/PeriodStart/PeriodEnd) determine the length. For example, if you only need monthly reporting visibility and use other means for daily trends, including only bill_billing_period_start_date/BillingPeriodStart and excluding the other _start_date/_end_date/PeriodStart/PeriodEnd columns will reduce the size of the BOYD collected dataset. Not surprisingly, including line_item_usage_start_date/ChargePeriodStart on an hourly CUR report will dramatically increase the size of your collected dataset.
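The effect of the date columns on length can be seen in a toy example. The sketch below assumes the collected dataset is grouped by the selected columns with costs summed, which is an assumption about the general technique rather than a description of BOYD's internals. The `collapse` helper and the sample rows are made up for illustration.

```python
from collections import defaultdict


def collapse(rows, keep_columns, cost_column="line_item_unblended_cost"):
    """Group rows by the kept columns and sum the cost column."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[c] for c in keep_columns)
        totals[key] += row[cost_column]
    return [
        dict(zip(keep_columns, key), **{cost_column: total})
        for key, total in totals.items()
    ]


# Two days of hourly line items for one service in one billing period.
rows = [
    {
        "bill_billing_period_start_date": "2024-01-01",
        "line_item_usage_start_date": f"2024-01-{day:02d}T{hour:02d}:00:00Z",
        "line_item_product_code": "AmazonEC2",
        "line_item_unblended_cost": 0.5,
    }
    for day in (1, 2)
    for hour in range(24)
]

monthly = collapse(rows, ["bill_billing_period_start_date", "line_item_product_code"])
hourly = collapse(
    rows,
    [
        "bill_billing_period_start_date",
        "line_item_usage_start_date",
        "line_item_product_code",
    ],
)
print(len(monthly), len(hourly))  # 1 48
```

Keeping only the billing period column collapses the 48 hourly rows into a single monthly row with the same total cost, while keeping the hourly usage start date preserves every row.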

With your schema selected, return to the home page and select a billing period; BOYD will then collect the data from S3. From there you can start to explore your inventory, identify classification strategies that add context to your data, and explore the reporting and analysis capabilities.