Writes the ByteByteGo newsletter -


2024-08-13 - Counting billions of content usage at Canva

  • https://blog.bytebytego.com/p/counting-billions-of-content-usage
  • Original design - Used a MySQL database, with separate worker services for data collection, deduplication, and aggregation. This ran into scalability issues.
    • I wondered why they needed deduplication and why it had become a big deal. Deduplication hints that the same event was being logged many times. They probably have a rule that counts repeated views as a single usage. Say I’m using an image that I bought: as I change things in my design, the page might reload a few times, and you probably can’t count each reload as a separate usage.
  • Tried migrating to DynamoDB since it can handle things at scale, but didn’t complete the migration.
  • Finally found success by migrating to an OLAP system using Snowflake.
  • They say they reduced the latency from over a day to under an hour. Wow. Seems like a Hadoop kind of system.
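
To make the deduplication guess concrete, here is a minimal Python sketch of "count repeated events as one usage, then aggregate per content item". It is not Canva's actual rule or pipeline: the `UsageEvent` fields, the dedup key (user, design, content), and the `count_usages` helper are all assumptions made up for illustration.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class UsageEvent(NamedTuple):
    # Hypothetical event shape; the real schema is not described in these notes.
    user_id: str
    design_id: str
    content_id: str


def count_usages(events: Iterable[UsageEvent]) -> Counter:
    """Count one usage per distinct (user, design, content) combination.

    Assumed rule: repeats of the same combination (e.g. caused by page
    reloads) are duplicates and must not be counted again.
    """
    seen = set()
    counts = Counter()
    for event in events:
        key = (event.user_id, event.design_id, event.content_id)
        if key in seen:
            continue  # duplicate of an already-counted usage
        seen.add(key)
        counts[event.content_id] += 1
    return counts


events = [
    UsageEvent("u1", "d1", "img42"),
    UsageEvent("u1", "d1", "img42"),  # same user, design, content: a reload
    UsageEvent("u2", "d7", "img42"),  # different user and design: new usage
    UsageEvent("u1", "d1", "img9"),
]
usage_counts = count_usages(events)
```

With these sample events, `img42` is counted twice (two distinct users and designs) and `img9` once; the reload is dropped. Keeping the `seen` set in memory only works for small inputs, which hints at why deduplication gets hard once the event volume reaches billions.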