Scaling in real time is tough because the problem is extremely complex: a system consists of an array of dynamic components, and a failure in any one of them can bring the entire system crashing down.
It could be a UI component that consumes too many resources, a database component that is overwhelmed by requests, or simply a server component that fails to secure additional infrastructure to serve skewed traffic.
And while there can never be an airtight solution, a clear understanding of the nuances will certainly help you make informed choices:
One common misconception is that scaling is all about additional infrastructure. While that is correct to some degree (mostly for self-hosted apps), it is actually software that causes the majority of the issues.
Each of those components has a threshold, something that must be factored in while designing your app.
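One way to make a component's threshold explicit in code is to refuse work beyond a fixed limit instead of letting the component degrade. The sketch below is only an illustration: the `CapacityGate` class and the limit of 2 are hypothetical, and the class is not thread-safe as written.

```python
class CapacityGate:
    """Hypothetical admission gate: rejects work above a fixed threshold."""

    def __init__(self, limit: int) -> None:
        self.limit = limit      # assumed threshold for this component
        self.in_flight = 0      # requests currently being served

    def try_acquire(self) -> bool:
        # Reject immediately once the threshold is reached.
        if self.in_flight >= self.limit:
            return False
        self.in_flight += 1
        return True

    def release(self) -> None:
        # Called when a request finishes, freeing one slot.
        if self.in_flight > 0:
            self.in_flight -= 1


gate = CapacityGate(limit=2)
results = [gate.try_acquire() for _ in range(3)]
print(results)  # [True, True, False]

gate.release()
accepted_after_release = gate.try_acquire()
print(accepted_after_release)  # True
```

In a real app the limit would come from load testing that specific component rather than from a guess, and a rejected request would typically be queued or answered with a "try again later" response.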
Conversely, caches have a high memory cost but a lower computational cost.
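The memory-for-compute trade can be seen in a small sketch using Python's standard `functools.lru_cache`. The `render_profile` function and the `maxsize` of 1024 are assumptions made for illustration; the function stands in for any expensive query or computation.

```python
from functools import lru_cache

CALLS = 0  # counts how often the expensive function body actually runs


@lru_cache(maxsize=1024)  # memory bound: at most 1024 results are kept
def render_profile(user_id: int) -> str:
    """Hypothetical stand-in for an expensive computation or query."""
    global CALLS
    CALLS += 1
    return f"profile:{user_id}"


# Three identical requests: only the first one pays the compute cost.
for _ in range(3):
    render_profile(42)

info = render_profile.cache_info()
print(CALLS, info.hits, info.misses, info.currsize)  # 1 2 1 1
```

The repeated calls are served from memory, so compute drops while the cache holds on to every stored result up to `maxsize`. Picking that bound is where the memory cost gets decided.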