The innovation in ICT can provide integrated information intelligence for better urban management and governance, sustainable socioeconomic growth and policy development using participatory processes.
By Zaheer Khan, Ashiq Anjum, Kamran Soomro and Muhammad Atif Tahir
Approximately 50% of the world’s population lives in urban areas, a figure expected to rise to nearly 60% by 2030. Urbanization is even more pronounced in Europe, where over 70% of people live in urban areas today, with projections of nearly 80% by 2030. A continuous increase in urban population strains a city’s limited resources, weakens its resilience to growing demands on those resources, and confronts urban governance with ever-increasing challenges.
Furthermore, sustainable urban development, economic growth and
management of natural resources such as energy and water require better
planning and collaborative decision making at the local level. In this regard,
the innovation in ICT can provide integrated information intelligence for
better urban management and governance, sustainable socioeconomic growth and
policy development using participatory processes.
Smart cities use a variety of ICT solutions to deal with real-life urban challenges, including environmental sustainability, socioeconomic innovation, participatory governance, better public services, planning and collaborative decision-making. In addition to creating a sustainable, future-oriented smart infrastructure, overcoming these challenges can empower citizens by giving them a personal stake in the wellbeing and betterment of their civic life. By applying these ICT solutions, city administrations can extract new information and knowledge hidden in large-scale data and so provide better urban governance and management. Such solutions enable efficient transport planning, better water management, improved waste management, new energy efficiency strategies, new construction and structural methods for the health of buildings, and effective environment and risk management policies for citizens. Other important aspects of urban life, such as public security, air quality and pollution, public health, urban sprawl and biodiversity loss, can also benefit. As a prime enabler for smart cities, ICT transforms application-specific data into useful information and knowledge that can help in city planning and decision-making.
From the ICT perspective, smart cities are being made possible by smarter hardware and software, e.g. Internet of Things (IoT) devices such as Radio Frequency Identification (RFID) tags, smart phones, sensor nets and smart household appliances, together with the capacity to manage and process large-scale data using cloud computing without compromising data security and citizens’ privacy. Over time, the volume of data generated by these IoT devices is bound to increase exponentially and to be classified as big data. In addition, cities already possess land use, transport, census and environmental monitoring data, which is collected from various local, often unconnected sources and used by application-specific systems but is rarely used as a collective source of information (i.e. a system of systems) for urban governance and planning decisions. Many local governments are making such data available for public use as "open data". Managing such large amounts of data and analyzing them for various applications, e.g. future city models, visualization, simulations, provision of quality public services and information to citizens, and decision-making, becomes challenging without developing and applying appropriate tools and techniques.
In the above context, the recent emergence of cloud computing promises solutions to such challenges by facilitating big data storage and delivering the capacity to process, visualize and analyze city data for information and knowledge generation. Such a solution can also help decision makers meet quality of service (QoS) requirements by providing an integrated information processing and analytic infrastructure for a variety of smart city applications to support decision-making for urban governance.
Figure 1 depicts our view of the main thematic pillars of smart cities: smart people, smart economy, smart environment, smart governance and smart mobility, which contribute towards the sustainability of resources and resilience against increasing urban demands. The main motive for developing such a view is to take a holistic approach to smart cities by providing data acquisition, integration, processing and analysis mechanisms to synthesize the information needed to enhance the resilience and sustainability of a city. Managing data for these thematic domains in a cloud environment provides the opportunity to integrate data acquired from various sources and to process and analyze it within acceptable time frames. However, adopting cloud computing for smart city applications is not straightforward, owing to a number of challenges and requirements [20]. Our aim here is to discuss how these challenges can be addressed in part by using ICT tools and software services to intelligently manage and analyze the complex big data of smart cities, by incorporating a suitable cloud architecture.
An Abstract Architectural Design of the Cloud-based Big Data Analysis
This section of
the article discusses the development of a cloud service for smart city related
big data analysis. Here, we describe the design and implementation of a
generic Cloud based Analytics Service.
The system architecture, as shown in Figure 2, is divided into three tiers to enable the development of a unified knowledge base. Each layer represents the functionality needed to meet the overall research objectives. The lowest layer in the architecture consists of distributed and heterogeneous repositories and various sensors that are subscribed to the system. The objective of this layer is data acquisition, cleansing and classification using standard approaches such as APIs or OGC (Open Geospatial Consortium) compliant web services. Existing tools like TheDataTank and CKAN can be utilised for data access, transformation and publishing (e.g. as XML, CSV, JSON or binary structures such as SHP files or relational databases) in a RESTful way. For data storage, Cassandra (un/semi-structured data, NoSQL), PostgreSQL (relational structured data) and the Virtuoso RDF store are selected. However, the detailed design and prototype of the bottom two tiers are outside the scope of this paper; they are partly covered elsewhere, and the rest is a work in progress.
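To make the acquisition, cleansing and classification steps of the lowest layer more concrete, the following is a minimal Python sketch, not the authors' implementation. The function names, the cleansing rule (drop records with missing or surplus fields), the classification rule (a batch is "structured" when every record has the same flat fields) and the routing table are all assumptions made for illustration; only the store names come from the text, and no real database is contacted.

```python
import csv
import io
import json

# Hypothetical routing table (an assumption for illustration): which of the
# stores named in the text a batch of each kind might be sent to.
STORE_FOR_KIND = {
    "structured": "PostgreSQL",
    "semi-structured": "Cassandra",
}


def parse_payload(payload, fmt):
    """Parse a CSV or JSON payload into a list of dict records."""
    if fmt == "csv":
        return [dict(row) for row in csv.DictReader(io.StringIO(payload))]
    if fmt == "json":
        data = json.loads(payload)
        return data if isinstance(data, list) else [data]
    raise ValueError(f"unsupported format: {fmt}")


def cleanse(records):
    """Trim whitespace and drop records with missing or surplus fields."""
    cleaned = []
    for record in records:
        if None in record:  # csv.DictReader files surplus columns under None
            continue
        tidy = {
            key.strip(): value.strip() if isinstance(value, str) else value
            for key, value in record.items()
        }
        if all(value not in ("", None) for value in tidy.values()):
            cleaned.append(tidy)
    return cleaned


def classify(records):
    """Call a batch 'structured' when every record has the same flat fields."""
    if not records:
        return "semi-structured"
    fields = set(records[0])
    flat = all(
        set(record) == fields
        and not any(isinstance(v, (dict, list)) for v in record.values())
        for record in records
    )
    return "structured" if flat else "semi-structured"


def ingest(payload, fmt):
    """Run parse -> cleanse -> classify and pick a target store name."""
    records = cleanse(parse_payload(payload, fmt))
    kind = classify(records)
    return records, kind, STORE_FOR_KIND[kind]


# Made-up sample payloads: a sensor CSV with one incomplete row, and a JSON
# feed whose records do not share the same fields.
csv_payload = "sensor_id, reading\ns1, 42 \ns2,\n"
json_payload = '[{"id": 1, "tags": ["bus"]}, {"id": 2}]'
csv_result = ingest(csv_payload, "csv")
json_result = ingest(json_payload, "json")
```

In a real deployment the parsing step would sit behind an API or an OGC-compliant web service, and the returned store name would be replaced by an actual write to that store.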
The resource data mapping and linking layer (middle layer) finds new scenarios and supports workflows to develop relations that were not possible in the isolated data repositories. Because the data sources are heterogeneous, the collected data is likely to arrive in a number of different formats and semantics and hence can benefit from data linking. With linked data or open data, for example, databases can be browsed to serve queries and find events of interest that could not be found without the links. Furthermore, a semantic data model can be developed as a layer on top of the linked data to make sense of the combined data. Once the metadata of heterogeneous data sources has been populated into metadata stores, mappings are established between the resources, links are generated and the data is made semantically relevant and browsable. This data browsing can help end users select different cross-thematic indicators and variables to perform analytics. Existing metadata formats (such as the European Data Model, Talis Aspire, the Open Library and DBLP as Linked Data) are preferable choices to describe and store metadata extracted from different sources. The data is then mapped using standardised resource description semantics, e.g. via an RDF store (such as Virtuoso DB) that holds all the necessary links between artefacts and resources. In the case of linked services, higher-level services and mashups can be composed to browse and make use of this data for interesting scenarios. SPARQL, an RDF query language, can then be used to retrieve and manipulate data stored in Resource Description Framework (RDF) format.
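The value of linking can be illustrated with a small sketch. The Python code below is not an RDF store and does not run SPARQL; it uses a plain list of subject-predicate-object tuples and a wildcard matcher to mimic a basic graph pattern query. All identifiers, predicates and values are invented for illustration. The point is that the question "which bus routes pass through wards with high NO2 readings?" can only be answered once the air quality and transport data share a common ward identifier.

```python
# All identifiers, predicates and values below are made-up illustration data.
AIR_QUALITY = [
    ("sensor:A1", "locatedIn", "ward:Central"),
    ("sensor:A1", "no2Level", "high"),
    ("sensor:B7", "locatedIn", "ward:North"),
    ("sensor:B7", "no2Level", "low"),
]
TRANSPORT = [
    ("route:8", "passesThrough", "ward:Central"),
    ("route:72", "passesThrough", "ward:Central"),
    ("route:5", "passesThrough", "ward:North"),
]
# "Linking" here simply means both sources use the same ward identifiers,
# so their triples can be queried together.
LINKED = AIR_QUALITY + TRANSPORT


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def routes_through_polluted_wards(triples):
    """Join three triple patterns on their shared variables.

    Roughly what this SPARQL pattern would express (prefixes omitted):
      SELECT ?route WHERE {
        ?sensor :no2Level "high" .
        ?sensor :locatedIn ?ward .
        ?route  :passesThrough ?ward .
      }
    """
    sensors = {s for s, _, _ in match(triples, p="no2Level", o="high")}
    wards = {
        o for sensor in sensors
        for _, _, o in match(triples, s=sensor, p="locatedIn")
    }
    routes = {
        s for ward in wards
        for s, _, _ in match(triples, p="passesThrough", o=ward)
    }
    return sorted(routes)


linked_answer = routes_through_polluted_wards(LINKED)
isolated_answer = routes_through_polluted_wards(AIR_QUALITY)
```

Run against the isolated air quality repository alone, the same query returns nothing, because the route information lives in a different source.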
An analytic engine in the top layer processes the data for application-specific purposes. The engine uses the data available in the linked data layer and lets users submit queries, application-specific algorithms and workflows to find information in the data repositories. In this respect, big data mining is a recent term for mining data sets that are large in terms of complexity, cardinality and continuality. Big data mining techniques are becoming an increasingly important and effective tool in data-driven applications such as network traffic risk analysis and business data analysis. These techniques will be extremely useful for generating non-obvious relations and associations from the huge volumes of data available from the public services of smart future cities.
Since the main focus of this paper is smart city data analytics, we concentrate on the analytic engine and explain it in detail. Various statistical modeling, machine learning and data mining techniques can be applied in the analytic engine. Existing tools such as RapidMiner and R, in combination with Hadoop MapReduce, can also be used to mine city data at scale. In the literature, big data mining is considered much more complex than the traditional data mining currently in practice. This is true for smart city data analytics because the multi-disciplinary nature of city data makes it possible to formulate a wide variety of city application scenarios. In this regard, some of the possible components for cloud-based big data mining or analysis are:
- Data processing / integration,
- Classification,
- Clustering,
- Data reduction,
- Visualization, and
- Finding association rules,
as depicted in Figure 3.
It is not
necessary to use all these components. Depending upon the application, subsets
of components may be needed for data analysis. For example, for the open data
use case, algorithms from only two components i.e. data processing and finding
association rules are needed. All these components are well known components in
data mining. Furthermore, these components can benefit from
state-of-the-art tools such as Apache Mahout and R for cluster based scalable
machine learning.
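As an illustration of the "finding association rules" component, here is a minimal single-machine Python sketch that derives rules of the form "X implies Y" from pairs of items using the standard support and confidence measures. It is a simplification for exposition only: it considers item pairs rather than larger itemsets, it is not the Apriori algorithm, and it does not use Mahout, R or Hadoop. The daily observations and the thresholds are invented.

```python
from itertools import combinations


def pair_rules(transactions, min_support, min_confidence):
    """Return (antecedent, consequent, support, confidence) for item pairs.

    support(X, Y)     = share of transactions containing both X and Y
    confidence(X -> Y) = count(X and Y) / count(X)
    """
    total = len(transactions)
    item_count = {}
    pair_count = {}
    for transaction in transactions:
        items = sorted(set(transaction))
        for item in items:
            item_count[item] = item_count.get(item, 0) + 1
        for pair in combinations(items, 2):
            pair_count[pair] = pair_count.get(pair, 0) + 1

    rules = []
    for (a, b), count in pair_count.items():
        support = count / total
        if support < min_support:
            continue
        # Each frequent pair can yield a rule in either direction.
        for antecedent, consequent in ((a, b), (b, a)):
            confidence = count / item_count[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return sorted(rules)


# Made-up daily observations, e.g. assembled from several open data feeds.
days = [
    {"rain", "congestion", "poor_air"},
    {"rain", "congestion"},
    {"congestion", "poor_air"},
    {"rain", "congestion", "poor_air"},
    {"dry"},
]
rules = pair_rules(days, min_support=0.5, min_confidence=0.9)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent} "
          f"(support {support:.2f}, confidence {confidence:.2f})")
```

With these sample days, only rules whose antecedent always co-occurs with the consequent pass the 0.9 confidence threshold. At city scale, the counting loop is the part that a distributed framework would parallelize.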
Conclusion:
Smart cities provide an opportunity to connect people and places using innovative technologies that help in better city planning and management. At the core of smart cities are the collection, management, analysis and visualization of the huge amounts of data generated every minute in an urban environment by socioeconomic, anthropogenic or natural environmental events and other activities. Smart city data can be collected directly from a variety of sensors, smart phones and citizens, and integrated (or linked) with city data repositories to perform analytical reasoning and generate required information (e.g. for end users) or new knowledge for decision-making for better urban governance. Innovations in information and communication technologies provide the opportunity to manage and process smart city data and to deliver timely and necessary information to relevant stakeholders for decision-making.
About The Authors:
Zaheer Khan and
Kamran Soomro - Faculty of Environment and Technology, Department of Computer
Science and Creative Technologies, University of the West of England, Bristol,
UK.
Ashiq Anjum - Faculty
of Business, Computing and Law, School of Computing and Mathematics, University
of Derby, Derby, UK.
Muhammad Atif Tahir - School of Computer Science and Digital Technologies, University of Northumbria, NE1 8ST, Newcastle upon Tyne, United Kingdom.
Publication Details:
This article is an extract from the technical paper "Towards cloud based big data analytics for smart future cities" by Zaheer Khan, Ashiq Anjum, Kamran Soomro and Muhammad Atif Tahir, originally published in the Journal of Cloud Computing: Advances, Systems and Applications (2015) 4:2, DOI 10.1186/s13677-015-0026-8.
© 2015 Khan et
al.; licensee Springer. This is an Open Access article distributed under the
terms of the Creative Commons Attribution License