Data produced by sensors in the IoT is increasing at an exponential rate. Heterogeneity, scale, timeliness, complexity, and privacy problems with large data sets impede progress at every phase of the pipeline that can create value from data.
By Sapna Tyagi(1), Ashraf Darwish (2), Mohammad Yahiya Khan (3)
Image Attribution: Data map / Source: jisc.ac.uk, Creative Commons attribution information
The physical world is becoming a type of information system. In the Internet of Things (IoT), sensors, actuators, and Radio Frequency Identification (RFID) tags are embedded in physical objects, from roadways to pacemakers, and are placed on products moving through supply chains, improving inventory management while reducing working capital and logistics costs. These objects are linked through wired and wireless networks, often using the same Internet Protocol (IP) that connects the Internet, and the networks churn out huge volumes of data that flow to computers for storage and analysis. Inventory goods are monitored using RFID tags, hospital patients are managed using RFID tags, and parking-space availability is tracked using a range of sensors. There are Internet-connected cars, sensors on raw food products, sensors on packages of all kinds, and data streaming in from the unlikeliest of places: restrooms, kitchens, televisions, personal mobile devices, gasoline pumps, car washes, refrigerators, vending machines, and SCADA systems. Professionals are highly motivated to harness this big data into an asset for the organization. Doing so requires tremendous storage and computing resources, linked with advanced software systems that generate a variety of graphical displays for analyzing data, and those resource demands rise accordingly.
The IoT is causing an unprecedented number and variety of products to emit data. Companies use this data to analyze and improve processes and to predict trends and failures. The data can also provide insights for product development, customer support, operations, and sales teams, who use the information to improve their features, increase revenues, lower costs, and more. Our solution is built on a proven framework and a tested methodology; it has generated tremendous results for leading manufacturing, high-technology, energy, and telecommunications companies.
Characteristics of Data in “Internet of things”
The volume of information deriving from the tags is substantial, generating large data sets. Velocity suggests that information is being generated at a rate that exceeds that of traditional systems. There is a variety of different types of information available to monitor. Variety indicates that multiple emerging forms of data are of interest to enterprises [1].
For example, Twitter and other social media have become a source of big data. In mid-2010, Twitter tweets hit 65 million per day and there were 190 million users [2] [3]. The “Internet of Things” can generate big data for a number of reasons. The volume of data attributable to the “Internet of Things” is substantial: as sensors interact with the world, things such as RFID tags generate volume upon volume of data.
As a result, automated digital processing becomes a requirement for feasibility. The velocity of data associated with the “Internet of Things”, compared with traditional transaction processing, explodes as sensors continuously capture data. The variety of data associated with the “Internet of Things” also grows as the types of sensors and the different sources of data expand. Variety deals with the complexity of large data sets and with the information and semantic models behind these data. The data collected thus take the form of structured, unstructured, semi-structured, and mixed data. Data variety imposes new requirements on data storage and database design, which should adapt dynamically to the data format.
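As a rough illustration of handling this variety, a gateway might normalize structured, semi-structured, and unstructured readings into one flat record shape before storage. This is a minimal sketch, not the paper's design, and the field names `sensor_id` and `value` are hypothetical:

```python
def normalize_reading(raw):
    """Map a raw payload (JSON-style dict, CSV-style row, or plain text)
    into one flat record so mixed-format data can share a store."""
    if isinstance(raw, dict):            # semi-structured JSON payload
        return {"sensor_id": raw.get("id"), "value": raw.get("value")}
    if isinstance(raw, (list, tuple)):   # structured CSV-like row
        return {"sensor_id": raw[0], "value": float(raw[1])}
    # unstructured: keep the text verbatim for later analysis
    return {"sensor_id": None, "value": None, "raw": str(raw)}

readings = [
    {"id": "rfid-17", "value": 3.2},     # JSON from an RFID reader
    ("temp-04", "21.5"),                 # CSV row from a gateway
    "door opened at 10:42",              # free-text event log line
]
records = [normalize_reading(r) for r in readings]
```

In practice the flexible cases would map onto a schema-on-read store, while the structured case could feed a conventional database directly.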
Veracity ensures that the data used are trusted, authentic, and protected from unauthorized access and modification. The data must be secured throughout their whole life cycle, from collection from trusted sources to processing on trusted compute facilities and storage on protected and trusted storage facilities. The veracity of data in the “Internet of Things” may also improve as the quality of sensor and other data improves over time.
For example, RFID tags today generate much more reliable information than they did a decade ago. Such high volumes of data, coupled with an increasing velocity and variety of data, produce large amounts of raw data that need analytical processing to create value. Variability, or data dynamicity, refers to change in the data while it is being processed or analyzed.
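One common way to keep readings authentic and tamper-evident in transit, sketched here under the assumption of a pre-shared key per device (an illustrative pattern, not the paper's own mechanism), is to attach a message authentication code to each payload:

```python
import hmac
import hashlib

SECRET_KEY = b"shared-device-key"   # hypothetical pre-shared key

def sign_reading(payload: bytes) -> str:
    """Attach an HMAC-SHA256 tag so tampering in transit is detectable."""
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()

def verify_reading(payload: bytes, tag: str) -> bool:
    """Constant-time comparison guards against timing attacks."""
    return hmac.compare_digest(sign_reading(payload), tag)

payload = b'{"sensor":"rfid-17","value":3.2}'
tag = sign_reading(payload)
assert verify_reading(payload, tag)                             # intact data passes
assert not verify_reading(b'{"sensor":"rfid-17","value":9.9}', tag)  # altered data fails
```

This covers integrity and authenticity; confidentiality and access control would additionally require encryption and key management along the whole life cycle described above.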
Challenges
Data produced by sensors in the IoT is increasing at an exponential rate. Heterogeneity, scale, timeliness, complexity, and privacy problems with large data sets impede progress at every phase of the pipeline that can create value from data. As data becomes more varied, more complex, and less structured, it has become imperative to process it quickly. Meeting such demanding requirements poses an enormous challenge for traditional databases. e-Infrastructure platforms need to be consolidated to ensure research continuity and cross-disciplinary collaboration and to deliver persistent services under an adequate governance model, and the architectures that address these needs must be upgraded. Such enormous data fundamentally requires massively distributed architectures and massively parallel processing for management and analysis.
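The parallel pattern can be sketched minimally: partition the data, aggregate each partition into a mergeable partial result, then combine the partials. A thread pool stands in here for what would be processes or machines at scale; the four-worker split is an arbitrary choice for illustration:

```python
from concurrent.futures import ThreadPoolExecutor

def partial_stats(chunk):
    """Local aggregation on one partition: (count, sum) partials merge exactly."""
    return (len(chunk), sum(chunk))

def parallel_mean(values, workers=4):
    """Partition the data, aggregate each partition concurrently, then merge.
    A real deployment would shard across machines; a pool stands in here."""
    size = max(1, len(values) // workers)
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_stats, chunks))
    n = sum(c for c, _ in partials)
    total = sum(s for _, s in partials)
    return total / n

readings = list(range(1000))
print(parallel_mean(readings))  # mean of 0..999
```

The key design choice is that each partial result is algebraically mergeable (counts and sums add), which is what lets the same pattern scale from a local pool to a massively distributed cluster.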
The three main categories under which management of this massive data falls are populating the huge IoT data store, querying the database, and managing the database. Above all of these lies one more challenge: communicating the data. Communication cost is much higher than processing cost.
The challenge here is to minimize that communication cost while satisfying the additional storage and data requirements. Bandwidth and latency are the two major network characteristics that affect communication between the clients and the data server.
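One way to trade a little fidelity for much less traffic, sketched here as an illustrative assumption rather than the paper's prescription, is to summarize readings at the edge and transmit one aggregate per window instead of every raw sample:

```python
def summarize_window(readings, window=10):
    """Send one (min, max, mean) triple per window instead of every raw
    sample, cutting transmitted records by a factor of `window`."""
    summaries = []
    for i in range(0, len(readings), window):
        chunk = readings[i:i + window]
        summaries.append((min(chunk), max(chunk), sum(chunk) / len(chunk)))
    return summaries

# ten raw temperature samples collapse to one summary record;
# the min/max fields still expose the 35.0 outlier
raw = [20.1, 20.3, 20.2, 20.5, 35.0, 20.4, 20.2, 20.3, 20.1, 20.2]
print(summarize_window(raw, window=10))
```

The window size directly tunes the bandwidth/latency trade-off: larger windows send less data but delay delivery, smaller windows approach per-sample reporting.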
Download the Paper - LINK
About the Authors:
Sapna Tyagi (1): Institute of Management Studies, Ghaziabad, UP, India
Ashraf Darwish (2): Faculty of Science, Helwan University, Cairo, Egypt
Mohammad Yahiya Khan (3): College of Science, King Saud University, Riyadh, Saudi Arabia
References:
[1] Zikopoulos, P., DeRoos, D., Parasuraman, K., Deutsch, T., Corrigan, D. and Giles, J. (2013) Harness the Power of Big Data. McGraw-Hill.
[2] Schonfeld, E. (2010) Costolo: Twitter Now Has 190 Million Users Tweeting 65 Million Times a Day. http://techcrunch.com/2010/06/08/twitter-190-million-users/
[3] Spangler, S., Chen, Y., Proctor, L., Lelecu, A., Behal, A., He, B. and Davis, T. (2009) COBRA—Mining Web for Corporate Brand and Reputation Analysis. Web Intelligence and Agent Systems, 7, 243-254.
Publication Details:
This article is an excerpt from a technical paper titled "Managing Computing Infrastructure for IoT Data", published in Advances in Internet of Things, Vol. 4, No. 3 (2014), Article ID: 48229, 7 pages. DOI: 10.4236/ait.2014.43005
Copyright © 2014 by authors and Scientific Research Publishing Inc.
This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/