Data mining in e-commerce is a vital way of repositioning an e-commerce company so that the enterprise has the information it needs about its business. Most companies have recently adopted e-commerce and now hold big data in their data repositories. The only way to get the most out of this data is to mine it, either to improve decision making or to enable business intelligence.
By Mustapha Ismail, Mohammed Mansur Ibrahim, Zayyan Mahmoud Sanusi and Muesser Nat
Management Information Systems Department
Cyprus International University, Haspolat, Lefkoşa via Mersin, Turkey
In e-commerce data mining there are three important processes that data must pass through before turning into knowledge or application.
The first and easier process of data mining is data pre-processing. It is actually a step that precedes the mining itself, in which the data is cleaned by removing unwanted data that has no relation to the required analysis. This boosts the performance of the entire data mining process, raises the accuracy of the data and reasonably reduces the time needed for the actual mining. This is usually the case when the company already has a target data warehouse; if not, the selection, cleaning and transformation of data, together termed pre-processing, will consume at least 80% of the process.
Pattern mining is the second step. It refers to the techniques or approaches used to develop recommendation rules or to build a model from a large data set; it can also be described as the techniques or algorithms of data mining. The most common patterns used in e-commerce are prediction, clustering and association rules. The purpose of the third step, pattern analysis, is to verify and shed more light on the discovered model in order to give a clear path for applying the data mining result. The analysis lays much emphasis on the statistics and rules of the pattern used, observing them after multiple users have accessed them.
However, all of this depends on how iterative the overall process is and on the interpretation of the visual information obtained at each sub-step. In general, therefore, the data mining process iterates through the following five basic steps:
• Data selection: This step is about identifying the kind of data to be mined, the goals for it and the tools needed to enable the process. At the end of it, the right input attributes and output information to represent the task are chosen.
• Data transformation: This step is about organizing the data according to the requirements by removing noise, converting one type of data to another, normalizing the data where needed, and defining the strategy for handling missing data.
• Data mining step per se: The transformed data is mined using any of the techniques to extract patterns of interest. The miner can also aid the data mining method by performing the preceding steps correctly.
• Result interpretation and validation: To better understand the data and its synthesized knowledge, together with its validity span, robustness is checked by testing the data mining application. The retrieved information can also be evaluated by comparing it with earlier expertise in the application domain.
• Incorporation of the discovered knowledge: This involves presenting the discovered knowledge to the decision maker so that it can be compared with, or checked and resolved for conflicts against, earlier extracted knowledge, and so that newly discovered patterns can be applied.
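The data transformation step can be illustrated with a minimal Python sketch. It assumes a single numeric column of order totals (hypothetical values), handles missing entries by substituting the mean of the observed values, and then applies min-max normalization. Both choices are only one possible strategy among many, not a prescribed method.

```python
from statistics import mean


def fill_missing(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fallback = mean(observed)
    return [fallback if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical order totals with one missing entry.
order_totals = [20.0, None, 40.0, 60.0]
filled = fill_missing(order_totals)
normalized = min_max_normalize(filled)
print(filled)
print(normalized)
```

Mean imputation keeps the sketch short; in practice the strategy for missing data depends on why the values are missing and on the mining technique that follows.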
Benefits
The application of data mining in e-commerce refers to the areas of e-commerce where data mining can be used to enhance business. When visiting an online store to shop, users normally leave behind certain facts that companies can store in their databases. These facts represent structured or unstructured data that can be mined to give the company a competitive advantage. Data mining can be applied in the following areas of e-commerce for the benefit of companies:
1) Customer Profiling - This is also known as a customer-oriented strategy in e-commerce. It allows companies to use business intelligence, through the mining of customers' data, to plan their business activities and operations and to develop new research on products or services for prosperous e-commerce. Classifying customers with great purchasing potential from the visiting data can help companies to lessen the sales cost.
Companies can use users' browsing data to identify whether they are shopping purposefully or just browsing, and whether they are buying something they are familiar with or something new. This helps companies to plan and improve their infrastructure.
2) Personalization of Service - Personalization is the act of providing content and services geared to individuals on the basis of information about their needs and behavior. Data mining research related to personalization has focused mostly on recommender systems and related subjects such as collaborative filtering. Recommender systems have been explored intensively in the data mining community. These systems can be divided into three groups: content-based, social data mining and collaborative filtering. They learn from explicit or implicit user feedback, which is usually represented as the user profile. Social data mining, which considers data created by groups of individuals as part of their daily activities, can be an important source of information for companies. Alternatively, personalization can be achieved with the aid of collaborative filtering, where users with similar interests are matched and the preferences of these users are used to make recommendations.
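The collaborative filtering idea described above, matching users with similar interests and using their preferences to recommend, can be sketched as follows. This is a simplified user-based variant in Python; the rating data, the user names and the choice of cosine similarity as the matching measure are all assumptions made for illustration.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two rating dicts (item -> positive rating)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(r * r for r in a.values()))
    norm_b = sqrt(sum(r * r for r in b.values()))
    return dot / (norm_a * norm_b)


def recommend(target, ratings, top_n=3):
    """Rank items the target has not rated by similarity-weighted ratings."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == target:
            continue
        sim = cosine_similarity(ratings[target], other_ratings)
        if sim <= 0:
            continue  # users with nothing in common contribute nothing
        for item, rating in other_ratings.items():
            if item not in ratings[target]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical explicit feedback on a 1-5 scale.
ratings = {
    "alice": {"printer": 5, "paper": 4},
    "bob": {"printer": 5, "paper": 5, "cartridge": 4},
    "carol": {"novel": 5, "bookmark": 4},
}
recommendations = recommend("alice", ratings)
print(recommendations)
```

Here only the user who shares rated items with the target influences the result; a production recommender would also normalize scores and cope with sparse or implicit feedback.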
3) Basket Analysis - Every shopper's basket has a story to tell, and market basket analysis (MBA) is a common retail, analytic and business intelligence tool that helps retailers to know their customers better. There are different ways to get the best out of market basket analysis, including:
• Identification of product affinities: Tracking product affinities that are not apparent, and leveraging them, is the real challenge in retail. Walmart customers purchasing Barbie dolls show an affinity towards one of three candy bars; obscure connections such as this can be discovered with advanced market basket analytics and used to plan more effective marketing efforts.
• Cross-sell and up-sell campaigns: These show the products purchased together, so customers who purchase a printer can be persuaded to pick up high-quality paper or premium cartridges.
• Planograms and product combos: These are used for better inventory control based on product affinities, for developing combo offers and for designing effective, user-friendly planograms that focus on products that sell together.
• Shoppers' profile: Analyzing market baskets over time with the aid of data mining gives a glimpse of who your shoppers really are, providing insight into their ages, income range, buying habits, likes and dislikes and purchase preferences, which can be leveraged to improve the customer experience.
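Product affinities of the kind listed above are usually quantified with the standard association rule measures of support, confidence and lift. The following Python sketch computes them for a single rule over a handful of hypothetical baskets; the basket contents and function names are illustrative only.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    hits = sum(1 for basket in transactions if items <= basket)
    return hits / len(transactions)


def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence and lift for the rule antecedent -> consequent."""
    both = support(transactions, set(antecedent) | set(consequent))
    ante = support(transactions, antecedent)
    cons = support(transactions, consequent)
    if ante == 0 or cons == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    confidence = both / ante   # P(consequent | antecedent)
    lift = confidence / cons   # above 1 suggests a positive affinity
    return {"support": both, "confidence": confidence, "lift": lift}


# Hypothetical baskets, each a set of purchased products.
baskets = [
    {"printer", "paper", "cartridge"},
    {"printer", "paper"},
    {"printer", "cartridge"},
    {"paper"},
]
metrics = rule_metrics(baskets, {"printer"}, {"paper"})
print(metrics)
```

In this toy data the lift of "printer -> paper" is below 1, because paper is bought just as often without a printer; a cross-sell campaign would look for rules whose lift is clearly above 1.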
4) Sales Forecasting - Sales forecasting considers the time an individual customer takes to buy an item and, in the process, tries to predict whether the customer will buy again. This type of analysis can be used to determine a strategy of planned obsolescence or to figure out complementary products to sell. In sales forecasting, cash flow can be projected under three scenarios: pessimistic, optimistic and realistic. This helps in planning for adequate capital to endure the worst possible scenario, that is, if sales do not actually go as planned.
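The three-scenario projection can be sketched with simple arithmetic. In the Python example below, the baseline sales, the fixed monthly costs and the growth rates assigned to each scenario are invented figures; the point is only to show how the pessimistic projection yields the capital buffer needed to survive the worst case.

```python
def project_cash_flow(baseline_sales, monthly_costs, growth_rate, months):
    """Cumulative net cash flow when sales change by growth_rate each month."""
    balance = 0.0
    sales = baseline_sales
    balances = []
    for _ in range(months):
        balance += sales - monthly_costs
        balances.append(balance)
        sales *= 1 + growth_rate
    return balances


# Assumed monthly growth rates for each scenario (illustrative only).
scenarios = {"pessimistic": -0.10, "realistic": 0.0, "optimistic": 0.10}
projections = {
    name: project_cash_flow(1000.0, 950.0, rate, months=3)
    for name, rate in scenarios.items()
}

# Capital needed to cover the deepest shortfall in the worst case.
capital_needed = max(0.0, -min(projections["pessimistic"]))
print(projections)
print(capital_needed)
```

A real forecast would derive the scenario rates from mined purchase histories rather than fixing them by hand.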
5) Merchandise Planning - Merchandise planning is useful for both online and offline retail companies. For online businesses, merchandise planning helps to determine stocking options and inventory warehousing. For offline companies, businesses that are looking to grow by adding stores can assess the amount of merchandise they will need by looking at the exact layout of the current store. Using the right approach to merchandise planning will lead to answers on what to do about:
• Pricing: Database mining helps to determine the best-suited price for products or services by revealing customer sensitivity.
• Deciding on products: Data mining shows e-commerce businesses which products customers actually desire, including intelligence on competitors' merchandise.
• Balancing of stocks: Mining the retail database helps to determine the right amount of stock needed, i.e. not too much and not too little, throughout the business year and during the buying seasons.
6) Market Segmentation - Customer segmentation is one of the best uses of data mining. The large amount of data gathered can be broken down into different, meaningful segments such as income, age, gender and occupation of customers, and these can be used when companies run email marketing campaigns or SEO strategies. Market segmentation can also help a company to identify its own competitors. This information alone can help the retail company to recognize that its usual competitors are not the only ones targeting the same customers' money as the company is.
Segmenting the database of a retail company will improve conversion rates, as the company can focus its promotion on a close-fitting and highly desired market. It also helps the retail company to understand the competitors involved in each segment, permitting the customization of products that will actually satisfy the target audience in a general way.
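A very simple form of segmentation groups customers by the attributes mentioned above. The Python sketch below assumes hypothetical customer records and arbitrary age bands, and builds one mailing list per (age band, gender) segment, as might be used for a targeted email campaign.

```python
from collections import defaultdict


def age_band(age):
    """Map an age to a coarse band; the cut-offs are arbitrary assumptions."""
    if age < 25:
        return "under 25"
    if age < 45:
        return "25-44"
    return "45+"


def segment_customers(customers):
    """Group customer emails by (age band, gender)."""
    segments = defaultdict(list)
    for customer in customers:
        key = (age_band(customer["age"]), customer["gender"])
        segments[key].append(customer["email"])
    return dict(segments)


# Hypothetical customer records.
customers = [
    {"email": "a@example.com", "age": 22, "gender": "female"},
    {"email": "b@example.com", "age": 31, "gender": "male"},
    {"email": "c@example.com", "age": 38, "gender": "male"},
    {"email": "d@example.com", "age": 52, "gender": "female"},
]
segments = segment_customers(customers)
print(segments)
```

Rule-based bands are the simplest option; clustering algorithms can instead discover segments from the data when the meaningful boundaries are not known in advance.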
Challenges
Besides the benefits, data mining also presents challenges for e-commerce companies, which are as follows:
1) Spider Identification - As is commonly known, the main aim of data mining is to convert data into useful knowledge. The main source of data for e-commerce companies is web pages. It is therefore critical for e-commerce companies to understand how search engines work, in order to follow how quickly things happen, how they happen and when changes will show up in the search engines. Spiders, also called bots or crawlers, are software programs that a search engine sends out to find new information by requesting pages and downloading them. It comes as a surprise to some people, but what the search engine does is follow a link on an existing website to find a new website, request a copy of that page and download it to its server. The search engine runs its ranking algorithm against this copy, and the outcome is what shows up on the search engine result page. The challenge, therefore, is that the search engine needs to download a correct copy of the website: the e-commerce website needs to be readable and visible, because the algorithm is applied to the search engine's database. Tools need mechanisms that automatically remove unwanted data before it is transformed into information, so that the data mining algorithm can provide reliable and sensible output.
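One way such a mechanism might work is to filter spider traffic out of the site's request log before mining it. The Python sketch below is a heuristic under stated assumptions: the log record layout, the list of user-agent markers and the rule that any client requesting /robots.txt is a spider are all illustrative, and real spider identification is considerably harder because many crawlers do not announce themselves.

```python
# Assumed user-agent substrings; illustrative and far from exhaustive.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")


def is_spider(hit):
    """Heuristic: self-declared crawler agents or a request for robots.txt."""
    agent = hit.get("user_agent", "").lower()
    return hit.get("path") == "/robots.txt" or any(m in agent for m in BOT_MARKERS)


def remove_spider_traffic(hits):
    """Drop every hit from any client that looked like a spider at least once."""
    spider_ips = {hit["ip"] for hit in hits if is_spider(hit)}
    return [hit for hit in hits if hit["ip"] not in spider_ips]


# Hypothetical request log (documentation-range IP addresses).
hits = [
    {"ip": "192.0.2.10", "path": "/product/42",
     "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"},
    {"ip": "203.0.113.5", "path": "/robots.txt",
     "user_agent": "ExampleFetcher/1.0"},
    {"ip": "203.0.113.5", "path": "/product/42",
     "user_agent": "ExampleFetcher/1.0"},
    {"ip": "198.51.100.7", "path": "/",
     "user_agent": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"},
    {"ip": "192.0.2.10", "path": "/cart",
     "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"},
]
human_hits = remove_spider_traffic(hits)
print(len(human_hits))
```

Filtering by client rather than by individual request removes the whole crawl session, including page requests that look ordinary on their own.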
2) Data Transformations - Data transformation poses a challenge for data mining tools. Today, the data to be transformed comes from two different sources: first, an active, operational system from which the data warehouse is built; and second, activities that involve assigning new columns, binning data and aggregating data. The first process needs to be modified only infrequently, that is, only when the site changes, whereas the second set of transformations presents a significant challenge in the data mining process.
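The second kind of transformation, assigning new columns, binning and aggregating, can be sketched in a few lines of Python. The order-line records, the column names and the bin thresholds below are hypothetical.

```python
from collections import defaultdict


def price_bin(amount):
    """Bin a monetary amount; the thresholds are arbitrary assumptions."""
    if amount < 20:
        return "low"
    if amount < 100:
        return "medium"
    return "high"


def transform(order_lines):
    """Add derived columns to each line, then aggregate spend per customer."""
    enriched = []
    for line in order_lines:
        total = line["unit_price"] * line["quantity"]  # new column
        enriched.append({**line, "total": total, "price_bin": price_bin(total)})

    spend = defaultdict(float)
    for line in enriched:
        spend[line["customer"]] += line["total"]       # aggregation
    return enriched, dict(spend)


# Hypothetical order lines from the operational system.
order_lines = [
    {"customer": "c1", "unit_price": 5.0, "quantity": 2},
    {"customer": "c1", "unit_price": 60.0, "quantity": 1},
    {"customer": "c2", "unit_price": 150.0, "quantity": 1},
]
enriched, spend_per_customer = transform(order_lines)
print([line["price_bin"] for line in enriched])
print(spend_per_customer)
```

Each business question tends to need its own derived columns and aggregation level, which is why this set of transformations is harder to standardize than the warehouse load.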
3) Scalability of Data Mining Algorithms - With sites such as Yahoo, which has over 1.2 billion page views a day, the large amount of data raises significant scalability issues:
• Given the large amount of data gathered from the website, it is difficult for a data mining algorithm to process it in a reasonable time, especially because such algorithms scale nonlinearly.
• The models that are generated tend to be too complicated for individuals to understand and interpret.
4) Make Data Mining Models Comprehensible to Business Users - The results of data mining should be clearly understood by business users, from the merchandisers who are in charge of decision making, to the creative designers who design the sites, to the marketers who spend advertising money. The challenge is to design and define extra model types and a strategic way to present them to business users. What regression models can we come up with, and how can we present them? (Even linear regression is usually hard for business users to understand.) How can we present nearest-neighbour models, for example? How can we present the results of association rule algorithms without overwhelming users with tens of thousands of rules?
5) Support Slowly Changing Dimensions - The demographic attributes of visitors change: they may get married, their salaries or income may increase, their children grow quickly, and the needs on which the model is based change. Product attributes change as well: new choices may become available, the design and packaging of the product or service may change, and quality may improve or degrade. Attributes that change over time in this way are often known as “Slowly Changing Dimensions”. The main challenge here is to keep track of those changes and, in the same vein, to provide support for the identified changes in the analysis.
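One common way to keep track of such changes, not named in the text above but widely used in data warehousing, is to keep a dated history row per version of a record (often called a Type 2 slowly changing dimension). The Python sketch below assumes a hypothetical in-memory history list and attribute names; it closes the current row when an attribute changes and lets the analysis ask what was true on a given date.

```python
from datetime import date


def apply_change(history, customer_id, new_attributes, effective):
    """Close the customer's current row and append a new version."""
    for row in history:
        if row["customer_id"] == customer_id and row["valid_to"] is None:
            if row["attributes"] == new_attributes:
                return history  # nothing changed, keep the current row open
            row["valid_to"] = effective
    history.append({
        "customer_id": customer_id,
        "attributes": dict(new_attributes),
        "valid_from": effective,
        "valid_to": None,  # None marks the current version
    })
    return history


def attributes_as_of(history, customer_id, when):
    """Return the attributes that were valid for the customer on `when`."""
    for row in history:
        if row["customer_id"] != customer_id:
            continue
        ends = row["valid_to"]
        if row["valid_from"] <= when and (ends is None or when < ends):
            return row["attributes"]
    return None


# Hypothetical customer whose marital status changes over time.
history = []
apply_change(history, 7, {"marital_status": "single", "income_band": "medium"},
             date(2014, 1, 1))
apply_change(history, 7, {"marital_status": "married", "income_band": "medium"},
             date(2015, 6, 1))
print(attributes_as_of(history, 7, date(2014, 12, 31)))
print(attributes_as_of(history, 7, date(2015, 6, 1)))
```

Keeping the old rows means a purchase made in 2014 can still be analyzed against the customer's attributes at that time rather than their current ones.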
6) Make Data Transformation and Model Building Accessible to Business Users - Providing definite answers to questions from individual business users requires data transformations as well as a technical understanding of the tools used in the analysis. Many commercial report designers and online analytical processing (OLAP) tools are hard for business users to understand. In this case, two preferred solutions are:
• Provision of templates (e.g. online analytical processing cubes and recommended transformations for mining) for the expected questions, and
• Provision of experts via consultation or even a service organization.
The challenge is basically to find a way to empower business users so that they can analyze the information themselves without any hiccups.
Copyright © 2015 by authors and Scientific Research Publishing Inc.
This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/