Data Mining is the computational process of discovering patterns in large data sets involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems. It is an interdisciplinary subfield of computer science. The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use.

Aside from the raw analysis step, it involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating.

The term is frequently applied to any form of large-scale data or information processing (collection, extraction, warehousing, analysis, and statistics) as well as any application of computer decision support systems, including artificial intelligence, machine learning, and business intelligence. The actual data mining task is the automatic or semi-automatic analysis of large quantities of data to extract previously unknown, interesting patterns such as groups of data records (cluster analysis), unusual records (anomaly detection), and dependencies (association rule mining, sequential pattern mining).
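Cluster analysis, one of the pattern types mentioned above, can be illustrated with a minimal k-means sketch in plain Python. The data and function names below are hypothetical toy examples, not anything from a real mining pipeline:

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal k-means sketch: assign each 2-D point to the nearest
    centroid, then recompute each centroid as its cluster's mean."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: (p[0] - centroids[c][0]) ** 2
                                + (p[1] - centroids[c][1]) ** 2)
            clusters[j].append(p)
        # Keep the old centroid if a cluster happens to be empty.
        centroids = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
            if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
    return centroids, clusters

# Two obvious groups of records (toy data)
data = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8),
        (8.0, 8.1), (7.9, 8.3), (8.2, 7.9)]
centroids, clusters = kmeans(data, k=2)
```

On well-separated data like this, the two recovered clusters match the two visible groups of records.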
MSE – mean squared error
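MSE can be sketched in a few lines of Python; the numbers below are made-up toy values:

```python
def mse(y_true, y_pred):
    """Mean squared error: average of the squared residuals."""
    assert len(y_true) == len(y_pred)
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

err = mse([3.0, 5.0, 2.0], [2.5, 5.0, 3.0])  # (0.25 + 0 + 1) / 3
```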
Root cause analysis
Binary hypothesis test
Null hypothesis (H0)
Alternative hypothesis (H1)
Type I error
Type II error
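The terms above fit together in a single procedure. A minimal sketch of a two-sided one-sample z-test (assuming a known population standard deviation, with made-up sample values):

```python
import math

def z_test(sample, mu0, sigma, alpha=0.05):
    """Binary hypothesis test of H0: mean == mu0 vs H1: mean != mu0,
    with sigma assumed known. Rejecting a true H0 is a Type I error
    (probability alpha); failing to reject a false H0 is a Type II error."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
    # Two-sided p-value from the standard normal CDF.
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p, p < alpha  # True => reject H0 in favour of H1

z, p, reject = z_test([5.1, 4.9, 5.2, 5.0, 5.1], mu0=4.0, sigma=0.1)
```

Here the sample mean (5.06) is far from the hypothesized mean of 4.0, so the test rejects H0.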
Principal Component Analysis
Support Vector Machines
Maximal margin classifier
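PCA, the first of the terms above, reduces to an eigenvector problem on the data's covariance matrix. In two dimensions the leading eigenvector has a closed form, which makes for a short self-contained sketch (toy data invented for illustration):

```python
import math

def first_pc_2d(points):
    """Direction of the first principal component of 2-D data:
    the leading eigenvector of the 2x2 covariance matrix, via the
    closed form tan(2*theta) = 2*cov_xy / (var_x - var_y)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    cxx = sum((p[0] - mx) ** 2 for p in points) / n
    cyy = sum((p[1] - my) ** 2 for p in points) / n
    cxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    theta = 0.5 * math.atan2(2 * cxy, cxx - cyy)
    return math.cos(theta), math.sin(theta)

# Points scattered along the line y = x: the first PC should be ~45 degrees.
pc = first_pc_2d([(0, 0.1), (1, 0.9), (2, 2.1), (3, 2.9), (4, 4.0)])
```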
DATA SCIENCE FAQ
DATA SCIENCE QUESTIONS
DATA SCIENCE DICTIONARY
DATA SCIENCE WIKI
RECURRENT NEURAL NETWORK
What is the definition of Machine Learning?
Machine Learning is a subfield of science that provides computers with the ability to learn without being explicitly programmed. The goal of Machine Learning is to develop learning algorithms that do the learning automatically, without human intervention or assistance, just by being exposed to new data. The Machine Learning paradigm can be viewed as “programming by example”. This subarea of artificial intelligence intersects broadly with other fields like statistics, mathematics, physics, theoretical computer science, and more.
Machine Learning can play a key role in a wide range of critical applications, such as data mining, natural language processing, image recognition, and expert systems. Machine Learning can be a game changer in all these domains and is set to be a pillar of our future civilization. If one wants a program to predict something, one can run historical data through a Machine Learning algorithm to “train” a model, which can then predict future patterns. Machine Learning is quite vast and is expanding rapidly into different sub-specialties and types.
Examples of Machine Learning problems include “Is this a car?”, “How much is this house worth?”, “Will this person like this movie?”, “Who is this?”, “What did you say?”, and “How do you fly this thing?”. All of these are excellent targets for a Machine Learning project, and in fact Machine Learning has been applied to each of them with great success.
Among the different types of Machine Learning tasks, a crucial distinction is drawn between supervised and unsupervised learning:
Supervised machine learning: The program is “trained” on a pre-defined set of “training examples”, which then facilitate its ability to reach an accurate conclusion when given new data.
Unsupervised machine learning: The program is given a set of data and must find patterns and relationships within it.
Supervised Machine Learning
In the majority of supervised learning applications, the ultimate goal is to develop a finely tuned predictor function. “Learning” consists of using sophisticated mathematical algorithms to optimize this function so that, given input data about a certain domain, it will accurately predict some interesting value. The goal of Machine Learning is not to make “perfect” guesses, but to make guesses that are good enough to be useful.
Many modern Machine Learning problems use thousands or even millions of dimensions of data and hundreds of coefficients to build predictions.
The iterative approach taken by Machine Learning algorithms works very well for many problems, but this does not mean Machine Learning can solve any arbitrary problem; it cannot. Still, it is a very powerful tool in our hands.
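The iterative optimization of a predictor function described above can be sketched with gradient descent on the simplest case, a linear predictor h(x) = t0 + t1*x fit by minimizing mean squared error. The data is a made-up toy example:

```python
def fit_line(xs, ys, lr=0.01, steps=5000):
    """Gradient descent on the MSE cost for a linear predictor
    h(x) = t0 + t1*x: repeatedly nudge the coefficients in the
    direction that reduces the average squared prediction error."""
    t0, t1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = sum((t0 + t1 * x) - y for x, y in zip(xs, ys)) * 2 / n
        g1 = sum(((t0 + t1 * x) - y) * x for x, y in zip(xs, ys)) * 2 / n
        t0 -= lr * g0
        t1 -= lr * g1
    return t0, t1

# Toy training examples generated from y = 2x + 1
t0, t1 = fit_line([0, 1, 2, 3, 4], [1, 3, 5, 7, 9])
```

After enough iterations the coefficients converge close to the values that generated the data, which is exactly the “good enough to be useful” guessing the text describes.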
In supervised learning, there are two categories of problems:
Regression – the value being predicted is continuous; it answers questions like “How much?” or “How many?”
Classification – a yes-or-no or categorical prediction, e.g. “Is this a cat?”, “Is this product in category x?”.
The underlying theory is more or less the same; the differences lie in the design of the predictor and the design of the cost function.
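The cost-function difference between the two problem types can be sketched side by side: squared error is the typical regression cost, while cross-entropy (log loss) is the typical classification cost. The inputs below are invented toy values:

```python
import math

def squared_error_cost(y_true, y_pred):
    """Typical regression cost: mean squared error over the examples."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def log_loss(y_true, y_prob):
    """Typical classification cost: cross-entropy between the true
    0/1 labels and the predicted probabilities."""
    return -sum(y * math.log(p) + (1 - y) * math.log(1 - p)
                for y, p in zip(y_true, y_prob)) / len(y_true)

reg_cost = squared_error_cost([2.0, 3.0], [2.5, 2.5])  # continuous targets
clf_cost = log_loss([1, 0], [0.9, 0.2])                # 0/1 labels
```

Both costs are small when predictions are close to the targets, but they penalize errors in ways suited to continuous versus categorical outputs.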
Unsupervised Machine Learning
Unsupervised learning is typically tasked with finding relationships within data. There are no training examples; the system is given a set of data and must find patterns on its own. A good example is identifying groups of friends in social network data. The algorithms used for this differ from those used in supervised learning.
Machine Learning is an incredibly powerful tool; it will help solve some of humanity's most pressing problems, as well as open up whole new opportunities.