Automatic Abstract Service Generation from Web Service Communities
Xumin Liu
Department of Computer Science
Rochester Institute of Technology
xl@cs.rit.edu
Hua Liu
Xerox Research Center
Webster, NY
hua.liu@xerox.com
Abstract—The concept of abstract services has been widely
adopted in service computing to specify the functionality of
certain types of Web services. It significantly benefits key
service management tasks, such as service discovery and
composition, as these tasks can first be applied to a small
number of abstract services and then mapped to the large number
of actual services. However, generating abstract services is
non-trivial. Current approaches either assume the existence
of abstract services or adopt a manual process that demands
intensive human intervention. We propose a novel approach to
fully automate the generation of abstract services from a service
community that consists of a set of functionally similar services.
A set of candidate outputs is first discovered based on a predefined
support ratio, which determines the minimum number of
services that produce the outputs. Then, the matching inputs
are identified to form the abstract services. We propose a set
of heuristics to effectively prune a large number of candidate
abstract services. A comprehensive experimental study on
real world web service data is conducted to demonstrate the
effectiveness and efficiency of the proposed approach.
Keywords-web service; abstract web service; web service
community
I. INTRODUCTION
Service-Oriented Architecture (SOA) provides a flexible
platform to address interoperability issues in a large scale
and heterogeneous environment [16]. Web services, regarded
as the most promising backbone technology that enables the
modeling and deployment of SOA, have enjoyed great attention
and wide adoption from both academia and industry.
This has resulted in a large and ever-increasing number of Web
services available on the Web, which, on the other hand,
makes it significantly harder to discover desirable
Web services and to compose services into a value-added
service. The current standard service discovery scheme
relies on UDDI, which suffers from two major
limitations: keyword-based discovery and relatively
high expectations on the input from service providers [14].
The concept of an abstract service has been proposed and
widely adopted in the literature to allow service discovery
and composition in a top-down fashion [6]. An abstract service
specifies the functionality of a certain type of service,
e.g., weather or map services. It can be mapped to the group
of actual services that provide the specified functionality.
Finding a desired service can start by identifying the matching
abstract service and then searching in the corresponding service
group. Hence, both the efficiency and accuracy of service
discovery are expected to improve by narrowing
down the search space. Similar to service discovery,
service composition can start by designing the composition
schema: identifying suitable abstract services and building
up a schema-level workflow. The schema is then instantiated
by finding actual services for the abstract services
and orchestrating them [4]. Following the same rationale,
abstract services also facilitate the process of dealing with
the frequent changes in service oriented enterprises [3], [9].
While the usage of abstract services holds tremendous
promise, how to generate abstract services poses a set of
key challenges. Existing approaches usually adopt a manual
process to create abstract services and generate the mapping
to the concrete services. The process starts by designing an
abstract service based on the designer’s view of the service
space and user query requirements. To have a complete and
comprehensive view of the service space, the designer needs
to manually go through all service descriptions, which is
time-consuming and error-prone. This is simply infeasible
for a large number of web services. Moreover, the designer
will also need to manually specify the mapping between
an abstract service and the corresponding concrete services.
An alternative way is to ask service providers to link their
services to abstract services when publishing services. This
is, however, impractical considering the autonomous and
independent nature of service providers.
Inspired by existing Web service community learning approaches,
we observe that the generation of abstract services
and their mapping to concrete services can be automated
in two steps: (1) generate a functionality-based
service organization that groups together services providing
similar functionalities, i.e., form service communities; (2)
extract common functional features of the services within a service
community to describe abstract services. Existing web
service community learning approaches can be leveraged
to bootstrap service communities [17], [7], [8], [13], [5].
These approaches compute the similarity between WSDL
descriptions by leveraging information retrieval techniques
(e.g. TF/IDF) to model each web service as a vector of
terms. They then apply data clustering algorithms to generate
service communities based on the service similarity. These
works differ mostly in the constructions of the term vectors,
2012 IEEE 19th International Conference on Web Services
978-0-7695-4752-7/12 $26.00 © 2012 IEEE
DOI 10.1109/ICWS.2012.41
calculations of the similarity metrics, and clustering algorithms
(e.g., QT, k-means, SVD, and SS-BVD). However,
these works only generate the mapping between a service
community and its member services and lack a sufficient summary
description of the functionality of the member services,
i.e., abstract services. Simply using cluster centroids or
representative terms to label a service community is far
from sufficient. First, such labels cannot precisely capture
the functionality of all member services. Users still need
to go through a service’s description to determine whether
the service provides the desired functionality. Second, it is
not guaranteed that the labels have high coverage of member
services’ functionality.
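The vector-space step shared by the community learning approaches cited above can be illustrated with a minimal sketch. It assumes each WSDL description has already been reduced to a list of terms; the term lists, the function names, and the particular TF/IDF weighting (relative term frequency times log inverse document frequency) are illustrative assumptions, not taken from any of the cited works. A clustering algorithm would then operate on the resulting pairwise similarities.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse {term: weight} vector per tokenized document."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical term lists standing in for three parsed WSDL descriptions.
docs = [
    ["get", "weather", "city", "forecast"],
    ["weather", "zipcode", "forecast"],
    ["map", "geocode", "route"],
]
vecs = tfidf_vectors(docs)
sim_weather = cosine(vecs[0], vecs[1])  # two weather-like services
sim_mixed = cosine(vecs[0], vecs[2])    # weather-like vs. map-like service
```

In this toy input the two weather-like descriptions share terms and obtain a positive similarity, while the weather-like and map-like descriptions share none and obtain zero.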
To address these issues, we propose an automatic abstract
service generation process to extract functional features
of a service community’s member services. All possible
abstract services will be generated to ensure maximum
coverage. The number of concrete services that instantiate
an abstract service is no less than a predefined threshold in
order to guarantee precision. In the remainder of the
paper, we use abstract service and functional label
interchangeably. We leverage association rule mining
techniques to efficiently generate and prune the candidate
abstract services [2]. We start by finding the possible outputs
of an abstract service, checking whether enough
services generate each output. For each such output,
we enumerate all possible inputs and check whether
enough of the services found in the first
step consume the given input. The mapping between
an abstract service and the member services is generated
during the process. We apply a set of heuristics to improve
the efficiency and scalability of the process.
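The two-step idea described above (outputs first, then matching inputs) can be sketched as follows. This is a brute-force illustration under stated assumptions, not the algorithm of Section III: it uses the member services of Table I, expresses the threshold as a minimum service count, grows output sets level by level in the style of Apriori, and, as a deliberate simplification, only tries each supporting service's own input set as a candidate input. The pruning heuristics and the bitmap check are not modeled, and all function names are hypothetical.

```python
from itertools import combinations

# Member services of Table I: id -> (input set, output set).
SERVICES = {
    "s1": ({"city", "state", "country"}, {"weather", "gas_price"}),
    "s2": ({"zipcode"}, {"weather", "gas_price"}),
    "s3": ({"city", "state", "country"}, {"weather", "map_url"}),
    "s4": ({"geocode"}, {"map_url", "gas_station"}),
    "s5": ({"zipcode"}, {"weather"}),
}


def frequent_outputs(services, min_count):
    """Step 1: output sets produced by at least min_count services."""
    items = sorted({o for _, outs in services.values() for o in outs})
    level = [frozenset([o]) for o in items]
    size = 1
    result = {}
    while level:
        kept = []
        for cand in level:
            supporters = {
                sid for sid, (_, outs) in services.items() if cand <= outs
            }
            if len(supporters) >= min_count:
                result[cand] = supporters
                kept.append(cand)
        # Apriori-style growth: only extend sets that met the threshold.
        size += 1
        level = list(
            {a | b for a, b in combinations(kept, 2) if len(a | b) == size}
        )
    return result


def abstract_services(services, min_count):
    """Step 2: pair each frequent output set with sufficiently supported inputs."""
    labels = []
    for outs, supporters in frequent_outputs(services, min_count).items():
        # Simplification (assumption): candidates are the supporters' own inputs.
        candidates = {frozenset(services[sid][0]) for sid in supporters}
        for cand in candidates:
            instances = sorted(
                sid for sid in supporters if services[sid][0] <= cand
            )
            if len(instances) >= min_count:
                labels.append((cand, outs, instances))
    return labels


labels = abstract_services(SERVICES, min_count=2)
```

With a minimum count of 2, this sketch yields two labels for the Table I community: one taking a zipcode and one taking city, state, and country, both returning weather. The third element of each tuple is the mapping from the label to its concrete services.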
The remainder of this paper is organized as follows. In
Section II, we formally define an abstract service and its
support ratio. We then present the abstract service generation
problem that we address. In Section III, we propose
a process of generating abstract services from a service
community. Possible functional labels are enumerated in a
heuristic way. We use a bitmap to efficiently check each
label’s support ratio. In Section IV, we present a comprehensive
experimental study to demonstrate the effectiveness
and performance of the proposed algorithms. In Section V,
we discuss representative related work. In Section VI,
we conclude the paper and discuss future work.
II. PROBLEM STATEMENT
A. Abstract Service and Support Ratio
An abstract service should be described in terms of the
functional capacity of the member services of a service community,
i.e., the ability to transform a set of inputs into a
set of outputs [1]. A query requirement specified in terms
of abstract services will be mapped to concrete services if
there is a match. A query requirement is usually formulated as
looking for a service that takes a given input and generates
a given output. An example would be: find a service that
takes an address as input and returns weather information.

Table I
MEMBER SERVICES IN WEATHER COMMUNITY
ID   Input                  Output
s1   city, state, country   weather, gas_price
s2   zipcode                weather, gas_price
s3   city, state, country   weather, map_url
s4   geocode                map_url, gas_station
s5   zipcode                weather

Along this line, we define an abstract service as:
Definition II.1 An abstract service (or functional label)
is a pair l = {l.I, l.O}, where l.I = {i1, i2, ..., im} is its
input, and l.O = {o1, o2, ..., on} is its output.
Through an abstract service l of a service community c,
users can understand what type of queries can be satisfied
by c’s member services. That is, being provided with the
data items in l.I, the instances of l can generate all the data
items in l.O. To measure the proportion of services in c that can
satisfy the query, we first define the support of a concrete
service as follows.
Definition II.2 A concrete service s is said to support an
input I, denoted as s.Î(I), if s.I ⊆ I; s is said to support
an output O, denoted as s.Ô(O), if s.O ⊇ O; s is said to
support an abstract service l, denoted as ŝ(l), if s.Î(l.I) and
s.Ô(l.O).
Based on Definition II.2, we compute the support ratio
of an abstract service (or functional label) as follows. Let
S = {s1, s2, ..., st} be the set of all member services in a
service community. The support ratio σ(l) is calculated as
follows:
$$\sigma(l) = \frac{|\{s_i \mid s_i \in S \wedge \hat{s}_i(l)\}|}{|S|} \tag{1}$$
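Definition II.2 and Equation (1) translate directly into set operations. The sketch below is one possible encoding, assuming inputs and outputs are modeled as sets of data item names. The class and function names are illustrative, and the community is the one listed in Table I.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    """A concrete service s with input set s.I and output set s.O."""
    inputs: frozenset
    outputs: frozenset


@dataclass(frozen=True)
class Label:
    """An abstract service (functional label) l with l.I and l.O."""
    inputs: frozenset
    outputs: frozenset


def supports(s, l):
    """Definition II.2: s supports l if s.I is a subset of l.I and s.O a superset of l.O."""
    return s.inputs <= l.inputs and s.outputs >= l.outputs


def support_ratio(l, community):
    """Equation (1): fraction of member services that support l."""
    return sum(supports(s, l) for s in community) / len(community)


# Member services s1..s5 of Table I, in order.
community = [
    Service(frozenset({"city", "state", "country"}), frozenset({"weather", "gas_price"})),
    Service(frozenset({"zipcode"}), frozenset({"weather", "gas_price"})),
    Service(frozenset({"city", "state", "country"}), frozenset({"weather", "map_url"})),
    Service(frozenset({"geocode"}), frozenset({"map_url", "gas_station"})),
    Service(frozenset({"zipcode"}), frozenset({"weather"})),
]

label = Label(frozenset({"zipcode"}), frozenset({"weather"}))
ratio = support_ratio(label, community)  # s2 and s5 support the label: 2/5
```

Note the asymmetry in the definition: a service supports a label when it needs no more than the label's input and produces at least the label's output.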
Example II.1 Suppose the weather com