Data Mining Techniques and Their Applications
Deepika Sattu, 800721246, dsattu@uncc.edu
Abstract: Data mining is a logical process used to extract, or “mine”, useful information from large amounts of data [2]. Knowledge Discovery from Data (KDD) is a synonym for data mining [13]. Many different techniques can be used to retrieve information from large amounts of data, and each technique generates different results. Which data mining technique to select depends on the type of business problem being solved.
Keywords: Clustering, Decision Trees, Classification, Prediction
I. INTRODUCTION
Data is critical for any organization. Every year an organization creates massive amounts of data, and how quickly the business reacts to that information determines whether it succeeds or fails. The central problem is how to handle the 3 V’s of Big Data efficiently.
The 3 V’s of Big Data are Volume, Velocity, and Variety.
Volume: the amount of data.
Velocity: the speed at which data is generated and processed.
Variety: the various forms that data takes, e.g., graphs, trees, and nodes [14].
Nowadays data is measured in exabytes and zettabytes, and it is impossible to analyze it and extract information from it manually. In some clusters the amount of data is increasing, whereas in others it is decreasing. Various data mining techniques, such as association, clustering, and prediction, are used to retrieve data from large databases (Big
Data Mining. It is the process of discovering interesting knowledge and significant structures in large amounts of data stored in data warehouses or other information repositories.
With the growth of technology, data has grown substantially in volume, variety, and velocity, satisfying the criteria of Big Data. The volume of data had already exceeded 100 EB by the end of the 1990s and reached 1.8 ZB in 2011, and we have now entered the age of the zettabyte. The volume of data in 2020 was forecast to be 50 times that of 2011. The term Big Data describes huge data sets (terabytes to exabytes), and Big Data analytics are the techniques applied to them.
Today, with the ever-growing use of computers in the world, information is constantly moving from one place to another. What this information is, whom it is about, and who is using it are discussed in this paper. The collection, interpretation, and determination of the use of this information has come to be known as data mining. The term has been around for only a short time, but the actual collection of data has been happening for centuries. The following paragraph gives a brief description of the history of data collection.
The use of data by businesses is not a new concept; however, the role of data within industries has increased so dramatically over the years that a business must understand how to handle data in order to continue operating. In today’s digital age, professionals credit a certain type of data called “big data” with helping businesses gain insight into consumers. Big data is created whenever you travel to your favorite restaurant, make a particular move in a video game, swipe your card to purchase your favorite pair of Crocs, or tell your Facebook friends what you had for breakfast. It is data that is too large to be captured and processed by standard business
In its infancy, data mining was as limited as the hardware being used. Large amounts of data were difficult to analyze because the hardware simply could not handle them [1]. The term "data mining" first began appearing in the 1980s, largely within the research and computer science communities. In the 1990s it was considered a subset of a process called Knowledge Discovery in Databases, or KDD [1]. KDD analyzes data in search of patterns that would not normally be recognized with the naked eye. Today, however, data mining does not limit itself to databases,
Data mining comprises techniques and algorithms that help derive analytics from a given data set and help in finding association
"Data mining is a rather new term for a challenge that has been growing for many years: how to scan very large databases to retrieve the high level conceptual information of the greatest interest" (Lindsay). With advances in data acquisition and storage technologies, the problem of how to turn measured raw data into useful information becomes an important one. Having reached sizes that defy even partial examination by humans, the data volumes are swamping users. For example, large US retail chains now mine their databases with sophisticated data mining programs to look for general trends and geographic clustering in purchases that are not easily visible in the huge multitude of products and sales.
Big data is a popular term used to describe the exponential growth and availability of data, both structured and unstructured. And big data may be as important to business – and society – as the Internet has become. Why? More data may lead to more accurate analyses. More accurate analyses may lead to more confident decision making. And better decisions can mean greater operational efficiencies.
Data mining is the procedure of finding new patterns and beneficial information in large amounts of data. It is also called knowledge discovery, knowledge mining from data, knowledge extraction, or data/pattern analysis. The main goal of data mining is to find patterns that were previously unknown. Once useful patterns are found, they can be used to make decisions that help develop a business. In short, data mining aims to discover implicit, previously unknown, and potentially useful information that is embedded in data.
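As a small illustration of what a discovered pattern can look like, the sketch below counts how often pairs of items appear together in a toy list of transactions and reports the support and confidence of each pair as a simple association rule. This is only a minimal sketch of the association idea; the transactions, the item names, the function name `pair_rules`, and the `min_support` threshold are all assumptions made for illustration and do not come from the sources cited here.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.5):
    """Return sorted (antecedent, consequent, support, confidence) tuples.

    Only rules between two single items are considered (hypothetical helper).
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))  # ignore duplicates within one basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # fraction of baskets containing both items
        if support < min_support:
            continue
        # Confidence of "a -> b" is P(b | a), and likewise for "b -> a".
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    return sorted(rules)


# Made-up toy data, for illustration only.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "eggs"],
]

for a, b, support, confidence in pair_rules(transactions):
    print(f"{a} -> {b}: support={support:.2f}, confidence={confidence:.2f}")
```

In this toy data every basket that contains butter also contains bread, so the rule "butter -> bread" has confidence 1.0. That is the kind of previously unknown regularity a retailer might act on.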
Big Data is fast becoming a ubiquitous term in the world of computers, but what does it actually mean? What are the fundamental principles of Big Data, and what impact is it having, and may it continue to have, on modern computing? What challenges does the model bring, and in what ways can these be resolved?
Data mining is defined as the examination of large databases of information in order to generate new information. Nearly every transaction or interaction leaves a data signature that is captured and stored. Data mining automates the analysis of patterns in data according to different categories. The information is sorted, collected, and assembled into data warehouses for more efficient analysis by algorithms, and is used to facilitate business decisions. Sophisticated mathematical algorithms are used to segment the data and evaluate the probability of future events.
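The segmentation step can be illustrated with k-means clustering, one common way of grouping similar records. The text above does not name a specific algorithm, so k-means is chosen here only as an example. The one-dimensional purchase amounts, the starting centers, and the function names `assign` and `kmeans_1d` are assumptions made for this sketch.

```python
def assign(values, centers):
    """Put each value into the cluster of its nearest center."""
    clusters = [[] for _ in centers]
    for v in values:
        nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
        clusters[nearest].append(v)
    return clusters


def kmeans_1d(values, centers, max_iterations=100):
    """Minimal 1-D k-means: alternate assignment and center updates."""
    centers = list(centers)
    for _ in range(max_iterations):
        clusters = assign(values, centers)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        new_centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:  # converged: centers stopped moving
            break
        centers = new_centers
    return centers, assign(values, centers)


# Made-up purchase amounts, for illustration only.
purchases = [12, 15, 14, 95, 102, 99]
centers, segments = kmeans_1d(purchases, centers=[0, 50])
print(centers)
print(segments)
```

With these toy values the algorithm separates the small purchases from the large ones. A real application would cluster many attributes at once and would normally use an established library rather than a hand-written loop.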
Companies have always analyzed data and used it to benefit the future of their businesses. However, the way data is stored, combined, analyzed, and used to predict the patterns and tendencies of consumers has evolved as technology has advanced over the past century. In the 1900s databases began as “computer hard disks,” and in 1965, after many other discoveries including voice recognition, “the US Government plans the world’s first data center to store 742 million tax returns and 175 million sets of fingerprints on magnetic tape.” The evolution of data into large databases continued in 1991, when the internet began to emerge and “digital storage became more cost effective than paper.” With the constant increase in digitally supplied data, Hadoop was created in 2005, and from that point forward “14.7 Exabytes of new information are produced this year,” a number that is rapidly increasing with the many mobile devices people have today (Marr). The evolution of the internet and the subsequent spread of mobile devices have changed the nature of data, and companies now need large central database management systems in order to run an efficient and successful business.
Data mining and knowledge discovery is the name frequently used for a highly interdisciplinary field that uses methods from several research areas to extract knowledge from real-world datasets. There is a distinction between the terms data mining and knowledge discovery, which seems to have been introduced by [Fayyad et al. 1996]: the term data mining refers to the core step of a broader process called knowledge discovery in databases. The architecture of a data mining system is shown in the following figure.
In modern times, the amount of data being stored is extremely large. Companies must deal with this abundance of data on a daily basis in both storing and analyzing as
Data mining can also be described as a way of picking out data from a large mix of information in the cloud. Put another way, data mining is digging for, or extracting, knowledge from data.