NeverSight/skills_feed/awq-quantization
Activation-aware weight quantization for 4-bit LLM compression, giving roughly 3× faster inference than FP16 with minimal accuracy loss. Use when deploying large models (7B–70B) on limited GPU memory, when you need faster inference and better accuracy preservation than GPTQ, or for instruction-tuned and multimodal models. Winner of the MLSys 2024 Best Paper Award.
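As a minimal sketch of the workflow this skill covers, the snippet below quantizes a model to 4-bit AWQ using the AutoAWQ library. The model name and output path are illustrative placeholders, and the `quant_config` values shown are the commonly documented AutoAWQ settings (4-bit weights, group size 128, GEMM kernels), not values taken from this listing.

```python
# Minimal AWQ quantization sketch using the AutoAWQ library.
# Model path and output directory are illustrative assumptions.
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "mistralai/Mistral-7B-Instruct-v0.2"  # example 7B model (assumption)
quant_path = "mistral-7b-instruct-awq"             # output directory (assumption)

# Typical AWQ settings: 4-bit weights, group size 128, GEMM kernel version.
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

# Load the FP16 model and its tokenizer.
model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Run activation-aware calibration and quantize the weights in place.
model.quantize(tokenizer, quant_config=quant_config)

# Save the quantized model and tokenizer for deployment.
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```

The saved model can then be loaded for inference with `AutoAWQForCausalLM.from_quantized(quant_path)`, or served through runtimes with AWQ support such as vLLM.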
Risk Score: 50 out of 100
Popularity: 10 stars, 2 forks
Updated: Feb 12, 2026