Architecting Next-Generation ML Inference: Optimizing Scalability, Cost, and Latency with Serverless Paradigms