Machine learning is a data analytics technique that teaches computers to do what comes naturally to humans and animals: learn from experience. Machine learning algorithms use computational methods to directly "learn" from data without relying on a predetermined equation as a model.
As the number of samples available for learning increases, the algorithm adapts to improve performance. Deep learning is a special form of machine learning.
How does machine learning work?
Machine learning uses two techniques: supervised learning, which trains a model on known input and output data to predict future outputs, and unsupervised learning, which uses hidden patterns or internal structures in the input data.
Supervised learning
Supervised machine learning creates a model that makes predictions based on evidence in the presence of uncertainty. A supervised learning algorithm takes a known set of input data and known responses to the data (output) and trains a model to generate reasonable predictions for the response to the new data. Use supervised learning if you have known data for the output you are trying to estimate.
Supervised learning uses classification and regression techniques to develop machine learning models.
Common regression algorithms include linear, nonlinear models, regularization, stepwise regression, boosted and bagged decision trees, neural networks, and adaptive neuro-fuzzy learning.
Using supervised learning to predict heart attacks
Physicians want to predict whether someone will have a heart attack within a year. They have data on previous patients, including age, weight, height, and blood pressure. They know if previous patients had had a heart attack within a year. So the problem is to combine existing data into a model that can predict whether a new person will have a heart attack within a year.
Detects hidden patterns or internal structures in unsupervised learning data. It is used to eliminate datasets containing input data without labeled responses.
Clustering is a common unsupervised learning technique. It is used for exploratory data analysis to find hidden patterns and clusters in the data. Applications for cluster analysis include gene sequence analysis, market research, and commodity identification.
Common algorithms for performing clustering are k-means and k-medoids, hierarchical clustering, Gaussian mixture models, hidden Markov models, self-organizing maps, fuzzy C-means clustering, and subtractive clustering.
Ten methods are described and it is a foundation you can build on to improve your machine learning knowledge and skills:
Regression
Classification
Clustering
Dimensionality Reduction
Ensemble Methods
Neural Nets and Deep Learning
Transfer Learning
Reinforcement Learning
Natural Language Processing
Word Embedding's
1. Regression
Regression methods fall under the category of supervised ML. They help predict or interpret a particular numerical value based on prior data, such as predicting an asset's price based on past pricing data for similar properties.
2. Classification
In another class of supervised ML, classification methods predict or explain a class value. For example, they can help predict whether an online customer will purchase a product. Output can be yes or no: buyer or no buyer. But the methods of classification are not limited to two classes.
3. Clustering
We fall into untrained ML with clustering methods because they aim to group or group observations with similar characteristics. Clustering methods do not use the output information for training but instead let the algorithm define the output
4. Dimensionality Reduction
We use dimensionality reduction to remove the least important information (sometimes unnecessary columns) from the data setFor example, and images may consist of thousands of pixels, which are unimportant to your analysis.
5. Ensemble Methods
Imagine that you have decided to build a bicycle because you are not happy with the options available in stores and online. Once you've assembled these great parts, the resulting bike will outlast all other options.
6. Neural networks and deep learning
Unlike linear and logistic regression, which is considered linear models, neural networks aim to capture nonlinear patterns in data by adding layers of parameters to the model.
Let's say you are a data scientist working in the retail industry. You've spent months training a high-quality model to classify images as shirts, t-shirts, and polos. Your new task is to create a similar model to classify clothing images like jeans, cargo, casual, and dress pants.
8. Reinforcement Learning
Imagine a mouse in a maze trying to find hidden pieces of cheese. At first, the Mouse may move randomly, but after a while, the Mouse's feel helps sense which actions bring it closer to the cheese. The more times we expose the Mouse to the maze, the better at finding the cheese.
9. Natural Language Processing
A large percentage of the world's data and knowledge is in some form of human language. For example, we can train our phones to autocomplete our text messages or correct misspelled words.
10. Word Embedding
TFM and TFIDF are numerical representations of text documents that consider only frequency and weighted frequencies to represent text documents. In contrast, word embedding can capture the context of a word in a document.
Summary
Studying these methods thoroughly and fully understanding the basics of each can serve as a solid starting point for further study of more advanced algorithms and methods.
Important Links
Comments
Post a Comment