AI & ML interests

The AI community building the future.

Recent Activity

Articles

Images for docs PR 3300

#625 opened about 4 hours ago by hubnemo
evalstate 
updated a bucket about 13 hours ago
evalstate 
posted an update 3 days ago
Hugging Face MCP Server v0.3.17
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

SEP-2640 "Skills Over MCP" support added (early access)
evijit 
posted an update 4 days ago
Weekend mini project! Since commentary on AI is inherently interdisciplinary, we connected the observations in the Pope's encyclical with decades of scholarship in Responsible AI and Ethics research and created an interactive space with these annotations!

Work with @IJ-Reynolds, @yjernite, and @meg

Lots to unpack. We started with 105 annotations. Please submit pull requests for any we may have missed!

society-ethics/annotated-encyclical
victor 
posted an update 10 days ago
Sharing how I built the LongCat-Video-Avatar 1.5 Space (500k+ views on X) in one agent session. I gave a coding agent its own AI lab on ZeroGPU, framed the goal, and walked away. It designed, deployed, tested against the live API, fixed, and shipped.

Full recipe with the copy-paste prompt: https://huggingface.co/blog/victor/building-zerogpu-spaces-autonomously
alvarobartt 
posted an update 15 days ago
Open agents on AWS SageMaker AI with open models from the Hugging Face Hub!

> Deploy an open model from the Hugging Face Hub on SageMaker AI
> Connect the deployed model to Strands Agents
> Add built-in and custom tools for tool calling
> Expose external capabilities through MCP integration
> Bonus: talk to your agent and visualize traces with Gradio

https://alvarobartt.com/agents-on-aws-sagemaker
danieldk 
posted an update 17 days ago
Two large changes in kernel-builder this week:

kernel-builder now links libstdc++ dynamically. To support a wide range of systems, we build against libstdc++ from manylinux_2_28 (EL 8 and later).

Following our Torch support policy that the current and previous Torch versions are supported, Torch 2.10 support was removed. We will soon also support the Torch stable ABI, so that it is possible to write kernels that support a large number of Torch versions.
alvarobartt 
posted an update 18 days ago
Latest hf-mem release added a breakdown of Mixture-of-Experts (MoE) memory usage!

TL;DR: MoEs can be misleading to reason about from active parameters alone. Each token only activates a subset of experts, but the serving setup still needs to account for the full resident memory footprint.

🧠 hf-mem now splits MoE memory into base model weights, routed experts, and KV cache
🏗️ Dense models usually load and use most weights every forward pass, while MoEs load many experts but only route each token to a few of them
⚡ Active parameter count isn't the same as memory footprint, especially for sparse architectures
📦 Runtime memory is about what is used per request/token, while loading memory also includes the expert weights that need to be resident
📚 KV cache can still dominate depending on context length, batch size, and concurrency
🔀 Expert Parallelism (EP) helps shard experts across accelerators when expert weights dominate
🚀 Data Parallelism (DP) + EP is often a good fit for throughput-oriented MoE serving
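
To make the gap between active parameters and resident memory concrete, here is a minimal back-of-the-envelope sketch. It is not how hf-mem computes its breakdown; the `MoEConfig` class, its field names, and every number in the example are hypothetical, and the KV cache formula assumes a standard attention layout with one K and one V tensor per layer and 16-bit weights and cache.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass
class MoEConfig:
    # Hypothetical fields, chosen only to make the arithmetic visible.
    base_params: int         # shared weights: attention, embeddings, router, norms
    num_layers: int
    experts_per_layer: int
    params_per_expert: int
    active_experts: int      # experts each token is routed to per layer (top-k)
    num_kv_heads: int
    head_dim: int
    bytes_per_param: int = 2  # assumes bf16/fp16 weights
    bytes_per_kv: int = 2     # assumes bf16/fp16 KV cache


def total_params(cfg: MoEConfig) -> int:
    """All parameters that must be resident, including every routed expert."""
    experts = cfg.num_layers * cfg.experts_per_layer * cfg.params_per_expert
    return cfg.base_params + experts


def active_params(cfg: MoEConfig) -> int:
    """Parameters a single token actually touches in one forward pass."""
    routed = cfg.num_layers * cfg.active_experts * cfg.params_per_expert
    return cfg.base_params + routed


def resident_weight_bytes(cfg: MoEConfig) -> int:
    """Weight memory needed at load time, regardless of routing."""
    return total_params(cfg) * cfg.bytes_per_param


def kv_cache_bytes(cfg: MoEConfig, context_len: int, batch_size: int) -> int:
    """KV cache size; the leading 2 covers one K and one V tensor per layer."""
    per_token = 2 * cfg.num_layers * cfg.num_kv_heads * cfg.head_dim * cfg.bytes_per_kv
    return per_token * context_len * batch_size


# A made-up model, not a real checkpoint.
cfg = MoEConfig(
    base_params=2_000_000_000,
    num_layers=32,
    experts_per_layer=64,
    params_per_expert=20_000_000,
    active_experts=4,
    num_kv_heads=8,
    head_dim=128,
)

print(f"active params:    {active_params(cfg) / 1e9:.2f}B")
print(f"total params:     {total_params(cfg) / 1e9:.2f}B")
print(f"resident weights: {resident_weight_bytes(cfg) / GIB:.1f} GiB")
print(f"KV cache:         {kv_cache_bytes(cfg, 32_768, 16) / GIB:.1f} GiB")
```

With these invented numbers, a token touches about 4.6B parameters while roughly 43B (about 80 GiB at 16 bits) have to stay resident, and a 32k context at batch size 16 adds another 64 GiB of KV cache.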

Check the repository at https://github.com/alvarobartt/hf-mem