It's not hard to build an analytics pipeline if you pay attention. Ours is basically free because of how our Fastly account is structured.
The flow is:
1. HTTPS request to Fastly
2. Synthetic request that logs to fastlylogs
3. Ship the log file to S3
4. Process with Lambda
That's close to free, apart from the RDS instance.
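For anyone curious what the last step of a flow like that could look like, here is a rough sketch, not our actual code. It assumes the log lines are newline-delimited JSON with a `path` field, which is purely an assumption for illustration; the function names and the aggregation are hypothetical too. The write to RDS is left out.

```python
import gzip
import json
from collections import Counter
from urllib.parse import unquote_plus


def parse_log_lines(raw):
    """Decode one log object (gzipped or plain) into a list of event dicts."""
    if raw[:2] == b"\x1f\x8b":  # gzip magic number
        raw = gzip.decompress(raw)
    events = []
    for line in raw.decode("utf-8").splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines instead of failing the whole file
        if isinstance(event, dict):
            events.append(event)
    return events


def count_page_views(events):
    """Count events per path ("path" is an assumed field name)."""
    return Counter(e["path"] for e in events if "path" in e)


def handler(event, context):
    """Lambda entry point for S3 object-created notifications."""
    import boto3  # imported lazily so the parsing code runs without it

    s3 = boto3.client("s3")
    totals = Counter()
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = unquote_plus(record["s3"]["object"]["key"])  # keys arrive URL-encoded
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        totals.update(count_page_views(parse_log_lines(body)))
    # Writing the totals to the database would go here.
    return dict(totals)
```

Keeping the parsing separate from the S3 and Lambda wiring means you can test it locally on a sample log file.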
I agree it's easy to build a pipeline, but one with validation and the myriad options that Snowplow provides is not trivial to build.
I see the data model, validation, and tracking libraries for any platform you can imagine as the hard part.
That's why we went with Snowplow in the first place.
But at some point, when your needs are stable, you can recreate everything and save money.
The problem is that you get so locked in that you end up stuck with a massive bill... and then you're hosed. Especially if there's a marketing engine behind it, like Segment.
We ran into similar issues with SaaS log aggregation a few years ago. It's scarily easy to get into a "this is just what service X costs" mindset with an FTE's worth of bills going out every month.
That's a great example. The cost of Datadog is a meme these days. In analytics, Segment started as a solution 'for the people' and quickly became very expensive, spawning multiple efforts to replace it, only for those efforts to be commercialised and become too expensive themselves :)