Abstract: Researchers have developed a variety of advanced audio-visual segmentation (AVS) models to accurately segment sound-producing objects. However, existing methods face two key limitations ...