Identifying malicious web sites has become a major challenge in
today's Internet. Previous work focused on detecting whether a web
site is malicious by dynamically executing JavaScript in instrumented
environments or by rendering web sites in client honeypots. Both
techniques incur a significant evaluation overhead, since the analysis
can take tens of seconds or even minutes per sample.
In this paper, we introduce a novel, purely static analysis approach,
the Δ-system, that (i) extracts change-related features between two
versions of the same web site, (ii) uses a machine-learning algorithm
to derive a model of web site changes, (iii) detects whether a change
was malicious or benign, (iv) identifies the underlying infection
vector campaign based on clustering, and (v) generates an identifying
signature.
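Step (i) above can be illustrated with a minimal sketch. It is not the system's actual feature extractor, which this abstract does not specify. It assumes, purely for illustration, that a change is described by the HTML elements added or removed between two versions of a page, and that each element is keyed by its tag name and attribute names. The names `TagCollector`, `collect_tags`, and `change_features` are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Counts the elements of an HTML document (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.tags = Counter()

    def handle_starttag(self, tag, attrs):
        # Assumption: an element is identified by its tag name and the
        # sorted names of its attributes; attribute values are ignored.
        key = (tag, tuple(sorted(name for name, _ in attrs)))
        self.tags[key] += 1


def collect_tags(html):
    """Parse one version of a page and return its element counts."""
    parser = TagCollector()
    parser.feed(html)
    parser.close()
    return parser.tags


def change_features(old_html, new_html):
    """Return hypothetical change features between two page versions."""
    old_tags = collect_tags(old_html)
    new_tags = collect_tags(new_html)
    added = new_tags - old_tags    # Counter subtraction keeps positive counts only
    removed = old_tags - new_tags
    return {
        "added": dict(added),
        "removed": dict(removed),
        "num_added": sum(added.values()),
        "num_removed": sum(removed.values()),
    }
```

Calling `change_features(old_html, new_html)` on two snapshots of the same page yields a small dictionary that could be turned into a feature vector for a classifier or a clustering step. Because only the static markup is compared, no JavaScript is executed and no page is rendered.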
We demonstrate the effectiveness of the Δ-system by evaluating it on a
dataset of over 26 million pairs of web sites, running it alongside a
web crawler for a period of four months. Over this time span, the
Δ-system successfully identified previously unknown infection
campaigns, including a campaign that targeted installations of the
Discuz!X Internet forum software by injecting infection vectors into
these forums and redirecting forum readers to an installation of the
Cool Exploit Kit.