Our paper at the AI4Sys '24 workshop, co-located with HPDC 2024, presents a novel technique that uses Generative AI (GenAI) to automate on-the-fly customization of AI/ML solutions. The ECO-LLM system dynamically adjusts task placement between edge and cloud, achieving minimal performance differences while significantly reducing the manual effort and time spent solving systems problems.
CLAP: Cost and Latency-Aware Placement of Microservices on the Computing Continuum
Our paper presents CLAP, a dynamic solution for optimizing microservice placement across edge and cloud resources in real-time applications. CLAP addresses workload-induced latency and cost efficiency using Reinforcement Learning. Experiments on video analytics applications demonstrate cost reductions of 47% and 58% while maintaining acceptable latency levels.
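To illustrate the flavor of learning-based edge/cloud placement, here is a minimal sketch of an epsilon-greedy policy that learns whether to place a service on the edge or in the cloud from observed latency and cost. The reward model, weights, SLO threshold, and simulated measurements are all hypothetical stand-ins, not CLAP's actual formulation.

```python
import random

PLACEMENTS = ["edge", "cloud"]

def measure(placement, load):
    """Hypothetical (latency_ms, cost_per_request) for a placement.

    Edge latency degrades with workload intensity; cloud latency is
    stable but dominated by the network round trip.
    """
    if placement == "edge":
        return 15.0 + 40.0 * load, 0.008
    return 95.0, 0.003

def reward(latency_ms, cost, latency_slo=100.0, cost_weight=50.0):
    # Heavily penalize SLO violations; otherwise trade off latency vs. cost.
    penalty = 1000.0 if latency_ms > latency_slo else 0.0
    return -(latency_ms + cost_weight * cost + penalty)

def run(episodes=2000, epsilon=0.1, alpha=0.1, seed=0):
    """Epsilon-greedy bandit over the two placements (illustrative only)."""
    rng = random.Random(seed)
    q = {p: 0.0 for p in PLACEMENTS}  # running value estimate per placement
    for _ in range(episodes):
        load = rng.random()  # simulated workload intensity in [0, 1)
        if rng.random() < epsilon:
            placement = rng.choice(PLACEMENTS)  # explore
        else:
            placement = max(q, key=q.get)       # exploit
        latency_ms, cost = measure(placement, load)
        q[placement] += alpha * (reward(latency_ms, cost) - q[placement])
    return q

if __name__ == "__main__":
    values = run()
    print("learned values:", {p: round(v, 1) for p, v in values.items()})
    print("preferred placement:", max(values, key=values.get))
```

Under these made-up measurements the policy converges to preferring the edge, since the simulated edge latency never violates the SLO and the cost penalty is small; with a tighter edge capacity or different weights the preference flips, which is the trade-off such systems navigate at runtime.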
ECO: Edge-Cloud Optimization of 5G applications
In the 21st IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGrid 2021), Melbourne, Victoria, Australia. Centralized cloud computing, with network latencies of 100+ milliseconds, cannot meet the tens-of-milliseconds to sub-millisecond response times required by emerging 5G applications such as autonomous driving, smart manufacturing, tactile internet, and augmented or virtual reality. We describe …
