Deep learning (also known as deep structured learning, hierarchical learning or deep machine learning) is a class of machine learning algorithms that use a cascade of many layers of nonlinear processing units for feature extraction and transformation. Each successive layer uses the output of the previous layer as its input. The algorithms may be supervised or unsupervised; applications include pattern analysis (unsupervised) and classification (supervised). These algorithms are based on the (unsupervised) learning of multiple levels of features or representations of the data: higher-level features are derived from lower-level features to form a hierarchical representation. Deep learning is part of the broader machine learning field of learning representations of data. The multiple levels of representation correspond to different levels of abstraction, and the levels form a hierarchy of concepts. In a simple case, there might be two sets of neurons: one set that receives an input signal and one that sends an output signal. When the input layer receives an input, it passes a modified version of that input on to the next layer.
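The layer-by-layer flow described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the weights and biases are hand-picked, hypothetical values (a real network would learn them from data), `tanh` is one possible choice of nonlinearity, and the helper names `dense` and `forward` are made up for this example.

```python
import math


def dense(inputs, weights, biases):
    """One layer: a weighted sum per unit, followed by a tanh nonlinearity."""
    return [
        math.tanh(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Feed the output of each layer into the next; keep every intermediate result."""
    activations = [inputs]
    for weights, biases in layers:
        activations.append(dense(activations[-1], weights, biases))
    return activations


# Hypothetical, hand-picked parameters (assumption: not taken from any trained model).
layers = [
    # 2 inputs -> 3 hidden units
    ([[0.5, -0.2], [0.1, 0.8], [-0.3, 0.4]], [0.0, 0.1, -0.1]),
    # 3 hidden units -> 1 output unit
    ([[1.0, -1.0, 0.5]], [0.2]),
]

activations = forward([1.0, 2.0], layers)
for depth, values in enumerate(activations):
    print(f"layer {depth}: {values}")
```

Each entry of `activations` is the representation at one level: index 0 is the raw input, index 1 the hidden features derived from it, and index 2 the output derived from those features.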

## deep learning,data mining,machine learning,artificial intelligence,kaggle,tensor flow,data scientist,neural network,what is machine learning,machine learning algorithms

Regularization

Multiple regression

Model validation

Precision

Recall

ROC curve

Predictive model

Overfitting

Loss function

L1/L2 Regularization

Response variable

Estimation

Multi-Collinearity

Resampling

Jackknife resampling/Jackknifing

MSE – mean squared error

Selection bias

Local Max/Min/Optimum

A/B Testing

Web Analytics

Root cause analysis

Big data

Data mining

Binary hypothesis test

Null hypothesis (H0)

Alternative hypothesis (H1)

Statistical Power

Type I error

Type II error

Bootstrapping

Cross-Validation

Ridge regression

Lasso

K-means clustering

Semantic Indexing

Principal Component Analysis

Supervised learning

Unsupervised learning

False positives

False negatives

NLP

Feature vector

Random forest

Support Vector Machines

Collaborative Filtering

N-grams

Cosine distance

Naive Bayes

Boosted trees

Decision tree

Stepwise regression

Intercept

Impurity measures

Maximal margin classifier

Kernel

Kernel trick

Dimensionality reduction

Curse of dimensionality

Newton’s method

DATA SCIENCE FAQ

DATA SCIENCE QUESTIONS

DATA SCIENCE DICTIONARY

DATA SCIENCE WIKI

MACHINE LEARNING

DEEP LEARNING

PREDICTIVE ANALYTICS

ARTIFICIAL INTELLIGENCE

NEURAL NETWORKS

RECURRENT NEURAL NETWORK

CNN

RNN

LSTM

## Machine Learning Definition

Machine learning is a subfield of computer science that gives computers the ability to learn without being explicitly programmed. The goal of machine learning is to develop learning algorithms that learn automatically, without human intervention or assistance, just by being exposed to new data. The machine learning paradigm can be viewed as "programming by example". This subarea of artificial intelligence intersects broadly with other fields such as statistics, mathematics, physics and theoretical computer science.
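To make "programming by example" concrete, here is a minimal sketch that never hard-codes the Celsius-to-Fahrenheit rule; it recovers the rule from a handful of example pairs using ordinary least squares. The data set, the choice of a straight-line model, and the helper names `fit_line` and `predict` are assumptions made for illustration only.

```python
# Examples of the desired behaviour: (celsius, fahrenheit) pairs.
# Nothing below states the conversion formula; it is estimated from these pairs.
examples = [(-10.0, 14.0), (0.0, 32.0), (10.0, 50.0), (20.0, 68.0), (30.0, 86.0)]


def fit_line(pairs):
    """Ordinary least squares fit of y = w * x + b to (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    variance = sum((x - mean_x) ** 2 for x, _ in pairs)
    w = covariance / variance
    b = mean_y - w * mean_x
    return w, b


def predict(x, w, b):
    """Apply the learned rule to a new input."""
    return w * x + b


w, b = fit_line(examples)
print(f"learned rule: fahrenheit = {w:.3f} * celsius + {b:.3f}")
print(f"prediction for 100 C: {predict(100.0, w, b):.1f} F")
```

Because these examples lie exactly on a line, the fitted slope and intercept come out close to 1.8 and 32. With noisy real-world data the same procedure would return the best-fitting line rather than an exact rule.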