
Deep Learning Project


A Gentle Introduction to Siamese Neural Networks Architecture 


What are Siamese Neural Networks?

Siamese Neural Networks, or SNNs, are among the most popular neural network architectures that use one-shot learning, a strategy that lets a model predict multiple classes from very little data. This ability has made Siamese neural networks very popular in real-world applications such as security, face recognition, signature verification, and more.


So how does the neural network architecture of Siamese networks make this possible?


Siamese Neural Networks: An Overview

A Siamese network consists of two or more identical subnetworks: neural networks with the same architecture, configuration, and weights. The weights stay tied during training as well: every parameter update is applied to all subnetworks at once, so they never diverge.


The purpose of having identical subnetworks is to train the model based on a similarity function that measures how different the feature vectors of one image are from the other. Because of this architecture, the model can be trained without much data.
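To make the idea of shared weights and a similarity function concrete, here is a minimal sketch in plain Python. It is an illustration only: the single linear layer, the helper names (`make_weights`, `embed`, `euclidean_distance`), and the four-value "images" are assumptions made for this example, not part of any particular library or of a real siamese model.

```python
import math
import random

def make_weights(in_dim, out_dim, seed=0):
    """Create one weight matrix; both branches reuse this same object."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]

def embed(x, weights):
    """Map an input vector to a feature vector (linear layer followed by tanh)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]

def euclidean_distance(u, v):
    """Similarity function: a smaller distance means more similar inputs."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# A single set of weights serves both "twin" branches (hypothetical toy sizes).
shared_weights = make_weights(in_dim=4, out_dim=3)

image_a = [0.9, 0.1, 0.4, 0.7]  # hypothetical flattened images
image_b = [0.9, 0.1, 0.4, 0.7]  # identical to image_a
image_c = [0.0, 0.8, 0.9, 0.1]  # a different image

dist_ab = euclidean_distance(embed(image_a, shared_weights), embed(image_b, shared_weights))
dist_ac = euclidean_distance(embed(image_a, shared_weights), embed(image_c, shared_weights))
print(dist_ab, dist_ac)
```

Because both inputs pass through the same weights, identical images always map to identical feature vectors, so their distance is zero regardless of how the weights were initialized.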




Why Use Siamese Neural Networks?

With siamese neural networks, the common class imbalance problem can be addressed since the network does not need too many samples for a given class in the training data.


Moreover, a new class can be added without training the entire network from scratch after the siamese neural network has been trained and deployed. Because the model trains by learning how similar or dissimilar image pairs are, samples from a new class can be added to the trained siamese network and training can simply be resumed. The network compares the new images with the rest of the classes and updates the weights and the fully connected layer accordingly.
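The comparison step behind this can be sketched as follows. This is a simplified illustration, assuming an already-trained branch has produced the embeddings; the two-value embeddings, the labels, and the `classify` helper are hypothetical. The sketch only covers comparing a query against one stored reference per class, not the resumed training described above.

```python
import math

def euclidean_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def classify(query_embedding, support_set):
    """Return the label whose reference embedding is closest to the query."""
    return min(
        support_set,
        key=lambda label: euclidean_distance(query_embedding, support_set[label]),
    )

# Hypothetical reference embeddings, one per known class.
support_set = {
    "alice": [0.9, 0.1],
    "bob": [0.1, 0.9],
}

query = [0.5, -0.8]
before = classify(query, support_set)

# Adding a class here means storing one more reference embedding.
support_set["carol"] = [0.6, -0.7]
after = classify(query, support_set)
print(before, after)
```

The design point is that the class list lives in the data being compared against, not in a fixed-size output layer, which is why a new class does not force a change to the network's shape.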


This behaviour is characteristic of network architectures that use one-shot learning. Other categories of neural networks would typically have to be retrained on a large, class-balanced dataset to achieve good performance.


But how does a siamese network learn from such a small set of samples? Let's look at the architecture and how the training process in siamese neural networks works.


Siamese Neural Network Architecture Explained

As described above, a siamese neural network is made up of two identical subnetworks. Feature vectors from both networks are compared using a loss function L. There are two strategies for training the siamese network, each using a different loss function. Both depend on two properties of the feature vectors.


First, the feature vectors of similar and dissimilar pairs should be descriptive, informative, and distinct enough from each other so that segregation can be learned effectively.


And secondly, the feature vectors of similar image pairs should be similar enough, and those for dissimilar pairs should be dissimilar enough so that the model can quickly learn semantic similarity.


To make sure the model can learn these feature vectors quickly, the loss function should reward both goals strongly enough: pulling similar items together and pushing dissimilar items apart. Here is where the siamese neural network strategy helps. By comparing one image with all the other images, the model learns what "similar" means and how to recognize dissimilar pairs.
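Comparing every image with every other image amounts to building labeled pairs from the dataset. A rough sketch of that step is below; the `make_pairs` helper and the file names are hypothetical, and the labeling (y = 0 for similar, y = 1 for dissimilar) is chosen to match the convention this article uses for the contrastive loss.

```python
from itertools import combinations

def make_pairs(samples):
    """Build (sample_1, sample_2, y) pairs from (sample, label) tuples.

    y = 0 marks a similar pair (same label), y = 1 a dissimilar pair.
    """
    pairs = []
    for (x1, label1), (x2, label2) in combinations(samples, 2):
        y = 0 if label1 == label2 else 1
        pairs.append((x1, x2, y))
    return pairs

# Hypothetical dataset: file names stand in for images.
samples = [
    ("cat_1.png", "cat"),
    ("cat_2.png", "cat"),
    ("dog_1.png", "dog"),
    ("dog_2.png", "dog"),
]

pairs = make_pairs(samples)
similar = [p for p in pairs if p[2] == 0]
dissimilar = [p for p in pairs if p[2] == 1]
print(len(pairs), len(similar), len(dissimilar))
```

With n samples there are n(n-1)/2 possible pairs, so even a small dataset yields many training examples, which is one way to see why the approach copes with little data.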


Cross-entropy loss cannot provide this kind of information because it works on a class-prediction basis. Mean squared error also does not give enough of the information needed for our goal. The most commonly used loss functions are the contrastive loss function and the triplet loss function. Let's look at each of them in detail.


Contrastive Loss Function

The contrastive loss function is a distance-based loss function that updates weights such that two similar feature vectors have a minimal Euclidean distance. In contrast, the distance between two dissimilar vectors is maximized.


In the contrastive loss equation, y represents whether or not the vectors are dissimilar, and Dw is the Euclidean distance between the vectors. When the vectors are dissimilar (y=1), the loss function minimizes the second term, for which Dw must be maximized (encouraging more distance between dissimilar vectors). We want these vectors to be at least m apart, and once they are already m units apart the term defaults to 0, so no further update is made for that pair.


Similarly, if the vectors are similar (y=0), the loss function must minimize Dw.


Contrastive Loss Function in Siamese Neural Networks
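Assuming the equation referred to above is the standard contrastive loss, L = (1 - y) * (1/2) * Dw^2 + y * (1/2) * max(0, m - Dw)^2, a minimal Python sketch for a single pair could look like this. The function name, the default margin of 1.0, and the sample vectors are assumptions made for illustration.

```python
import math

def euclidean_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def contrastive_loss(u, v, y, margin=1.0):
    """Contrastive loss for one pair; y = 0 similar, y = 1 dissimilar."""
    d_w = euclidean_distance(u, v)
    similar_term = (1 - y) * 0.5 * d_w ** 2                   # active when y = 0
    dissimilar_term = y * 0.5 * max(0.0, margin - d_w) ** 2   # active when y = 1
    return similar_term + dissimilar_term

anchor = [0.0, 0.0]
near = [0.3, 0.4]  # about 0.5 away from the anchor
far = [3.0, 4.0]   # 5.0 away from the anchor

loss_similar_near = contrastive_loss(anchor, near, y=0)     # penalized for the gap
loss_dissimilar_near = contrastive_loss(anchor, near, y=1)  # inside the margin
loss_dissimilar_far = contrastive_loss(anchor, far, y=1)    # beyond the margin
print(loss_similar_near, loss_dissimilar_near, loss_dissimilar_far)
```

The last case shows the "defaulting to 0" behaviour: a dissimilar pair that is already farther apart than the margin contributes no loss.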


However, this function is binary in nature: it only pulls vectors close together or pushes them apart, so it cannot tell us how similar two vectors are to each other. Another loss function therefore helps us learn both a similarity score and dissimilarity in a better way.


Triplet Loss Function in Siamese Network

By using triplet loss, we can tell how similar an image looks to the others (within or outside its class) when compared. The siamese network learns the similarity ranking using the score computed in this fashion.


For this, the loss is computed by comparing a given image (called the anchor image) with a positive image (which is similar to the anchor image) and a negative image (which is dissimilar to it). By computing the distance for each of these two pairs, the model learns what similarity looks like and how different the given image must be from the other classes.


So, in the triplet loss equation, f(A) is the embedding of the anchor image, and f(P) and f(N) are the embeddings of the positive and negative images, respectively. Again, for the loss function to minimize the RHS, the distance term with f(N) would have to be maximized and that with f(P) minimized. This aligns with the strategy that we want similar pairs closer and dissimilar pairs further apart. α is a margin: it sets how much farther from the anchor the negative must be than the positive before the loss drops to zero.


Triplet Loss Function in Siamese Networks
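Assuming the commonly used form with squared Euclidean distances, L = max(||f(A) - f(P)||^2 - ||f(A) - f(N)||^2 + α, 0), the triplet loss for one triplet can be sketched in Python as below. The helper names, the value α = 0.2, and the sample embeddings are assumptions for illustration only.

```python
def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def triplet_loss(f_a, f_p, f_n, alpha=0.2):
    """Triplet loss for one (anchor, positive, negative) set of embeddings."""
    positive_term = squared_distance(f_a, f_p)  # should become small
    negative_term = squared_distance(f_a, f_n)  # should become large
    return max(positive_term - negative_term + alpha, 0.0)

f_a = [0.0, 0.0]       # anchor embedding
f_p = [0.1, 0.0]       # positive embedding, close to the anchor
f_n_easy = [1.0, 0.0]  # negative that is already far from the anchor
f_n_hard = [0.2, 0.0]  # negative that sits too close to the anchor

easy = triplet_loss(f_a, f_p, f_n_easy)
hard = triplet_loss(f_a, f_p, f_n_hard)
print(easy, hard)
```

The easy triplet already satisfies the margin and yields zero loss, while the hard triplet yields a positive loss, so only triplets that violate the margin drive weight updates.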




Pros and Cons of Siamese Neural Networks

As we saw in the introduction, siamese neural networks offer many benefits over conventional CNNs in certain specific tasks.


Advantages of Siamese Network

Semantic Similarity: Firstly, siamese networks are not trained to reduce class-prediction errors but to capture semantic similarity. This encourages the model to learn progressively better embeddings that represent images from the support set and bring related concepts close together in the feature space. By learning such a feature space, similar to how textual models learn word embeddings, the model learns which images are more similar than others instead of just extracting static features using convolutions.


Class Imbalance: The biggest benefit directly applicable to the real world is the ability to perform well on very little data. With the data requirement reduced, the problem of class imbalance also becomes far less severe.


Siamese Neural Network for Face Recognition 

Face recognition is another image recognition or classification task. One-shot learning is particularly applicable to this task because in practical cases it is rarely feasible to collect sufficient samples of one person's face (one label). Face recognition is often used as an attendance system or as a security measure to restrict access to buildings and offices to employees only.


In this case, not only is it impractical to collect many images of one person to reach a decent success rate, but granting access to a new employee would also mean training the entire CNN from scratch and risking the existing performance.


Siamese Neural Network for Image Classification 

Signature verification is a common use of image classification in the context of one-shot learning. A signature verification system checks the authenticity of a given signature against the one stored in a dataset. Based on the signature's similarity to the stored one, the sample can be classified as real or fake. With this task widely prevalent in banks and financial institutions worldwide, Siamese networks quickly became the go-to solution for this otherwise laborious manual task.
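The real-or-fake decision described here can be sketched as a distance threshold on embeddings. This is only an illustration: the `verify` helper, the threshold of 0.5, and the three-value embeddings are hypothetical, and in practice the threshold would be tuned on validation data.

```python
import math

def euclidean_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def verify(reference_embedding, candidate_embedding, threshold=0.5):
    """Accept the candidate as genuine if it is close enough to the reference."""
    return euclidean_distance(reference_embedding, candidate_embedding) <= threshold

# Hypothetical embeddings produced by a trained siamese branch.
reference = [0.2, 0.7, 0.1]    # stored signature
genuine = [0.25, 0.65, 0.15]   # new signature by the same person
forged = [0.9, 0.1, 0.6]       # signature by someone else

is_genuine_accepted = verify(reference, genuine)
is_forged_accepted = verify(reference, forged)
print(is_genuine_accepted, is_forged_accepted)
```

A lower threshold rejects more forgeries but also more genuine signatures, so the value is a trade-off between the two kinds of error.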


Is a Siamese Network Supervised?

Yes, siamese networks are trained in a supervised fashion. They need labeled information to know whether the images they compare are similar. However, one can also tune siamese networks to learn in a self-supervised (SSL: self-supervised learning) manner.

