Mooncake: A KVCache-centric Disaggregated Architecture for LLM Serving

(github.com)

8 points | by sarkory 2 days ago

1 comment