Acoustic surveillance has advantages over video surveillance in some special situations, such as in darkness or in confined areas where camera-based monitoring raises public privacy concerns [1]. Previous approaches to developing acoustic monitoring systems for automatic surveillance include a gunshot detection system based on features derived from the time-frequency domain and a Gaussian Mixture Model (GMM) classifier [2]. A two-stage approach is reported in [3], where the first stage classifies sound events into typical and atypical cases, and subsequent processing is applied only to atypical events such as gunshots, screaming, and explosions. An audio surveillance system for a typical office environment is reported in [4]; it employs a background noise model that is continuously updated for event detection, and both supervised classification and k-means data clustering are examined. In this paper, we treat sound surveillance as a problem of first classifying sound events into human and non-human sounds, and then applying different strategies for the subsequent processing of human screams and non-human emergency sounds.
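To make the hierarchical scheme concrete, the following is a minimal sketch (not the authors' implementation) of the idea described above: one GMM is trained per sound class on frame-level features, a test clip is first labelled human or non-human, and the label routes it to class-specific follow-up processing. The feature type (e.g. MFCCs), the class names, and the two-way class partition are illustrative assumptions.

import numpy as np
from sklearn.mixture import GaussianMixture

def train_class_gmms(features_by_class, n_components=8):
    """Fit one diagonal-covariance GMM per sound class.

    features_by_class: dict mapping class name -> (n_frames, n_dims) array
    of frame-level features (e.g. MFCCs), an assumed input format.
    """
    gmms = {}
    for label, feats in features_by_class.items():
        gmm = GaussianMixture(n_components=n_components,
                              covariance_type="diag", random_state=0)
        gmms[label] = gmm.fit(feats)
    return gmms

def classify_clip(gmms, clip_features):
    """Assign the clip to the class whose GMM gives the highest
    average per-frame log-likelihood."""
    scores = {label: gmm.score(clip_features) for label, gmm in gmms.items()}
    return max(scores, key=scores.get)

# Assumed partition of classes into human and non-human sounds.
HUMAN_CLASSES = {"scream", "speech"}

def process_event(gmms, clip_features):
    """First stage: human vs. non-human; second stage: route to
    class-specific subsequent processing."""
    label = classify_clip(gmms, clip_features)
    if label in HUMAN_CLASSES:
        return "human sound (%s): apply scream-verification strategy" % label
    return "non-human sound (%s): apply emergency-sound strategy" % label

This is only a conceptual outline under the stated assumptions; the actual features, class inventory, and second-stage strategies are defined later in the paper.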