Implementing Collaborative Filtering to Recommend Anime

Photo by Yilin Liu on Unsplash

Building recommender systems today requires specialized expertise in analytics, machine learning and software engineering, and learning new skills and tools is difficult and time-consuming. In this notebook, we start from scratch and cover some fundamental techniques and their implementations in Python. I build the recommendation system using collaborative filtering, which helps users find content they are likely to enjoy.
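As a rough illustration of the collaborative filtering idea, here is a minimal user-based sketch in plain Python. The ratings, user names and the `recommend` helper are made up for this example and are not taken from the notebook's dataset or code; a real implementation would more likely use a library and a much larger rating matrix.

```python
from math import sqrt

# Hypothetical toy ratings: user -> {anime title: rating}. Illustrative only.
ratings = {
    "alice": {"Naruto": 5, "Bleach": 4, "Monster": 1},
    "bob": {"Naruto": 4, "Bleach": 5, "Death Note": 2},
    "carol": {"Monster": 5, "Death Note": 4, "Naruto": 1},
}


def cosine_similarity(a, b):
    """Cosine similarity between two users over the items both have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)


def recommend(user, ratings, top_n=3):
    """Predict ratings for unseen items as a similarity-weighted average."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        if sim <= 0:
            continue
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
                weights[item] = weights.get(item, 0.0) + sim
    predicted = {item: scores[item] / weights[item] for item in scores}
    return sorted(predicted.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


recommendations = recommend("alice", ratings)
```

Users who rated the same titles similarly get a higher similarity, so their ratings count more when predicting a score for a title the target user has not seen yet.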

Before we start building the recommendation system, we need to understand the following concepts, which we will use along the way:


Implementing Books Recommender Using Weighted Average Technique

Photo by 🇸🇮 Janko Ferlič on Unsplash

A recommender system, or a recommendation system (sometimes replacing ‘system’ with a synonym such as platform or engine), is a subclass of information filtering system that seeks to predict the “rating” or “preference” a user would give to an item.

Recommender systems are used in a variety of areas, with commonly recognized examples taking the form of playlist generators for video and music services, product recommenders for online stores, or content recommenders for social media platforms and open web content recommenders. …
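The exact weighted average formula is not shown in this excerpt, so the sketch below assumes one common formulation: each item's own average rating is blended with the global mean, weighted by how many votes the item has. The book titles, vote counts and the `min_votes` threshold are hypothetical.

```python
def weighted_rating(avg_rating, num_votes, min_votes, global_mean):
    """Blend an item's average rating with the global mean.

    Items with few votes are pulled towards the global mean, so a book
    with three 5-star votes does not outrank a widely rated one.
    """
    v, m = num_votes, min_votes
    return (v / (v + m)) * avg_rating + (m / (v + m)) * global_mean


# Hypothetical catalogue, for illustration only.
books = [
    {"title": "Book A", "avg_rating": 4.9, "num_votes": 3},
    {"title": "Book B", "avg_rating": 4.4, "num_votes": 250},
    {"title": "Book C", "avg_rating": 3.8, "num_votes": 900},
]

total_votes = sum(b["num_votes"] for b in books)
global_mean = sum(b["avg_rating"] * b["num_votes"] for b in books) / total_votes
min_votes = 50  # assumed threshold; usually chosen from the vote distribution

ranked = sorted(
    books,
    key=lambda b: weighted_rating(
        b["avg_rating"], b["num_votes"], min_votes, global_mean
    ),
    reverse=True,
)
ranked_titles = [b["title"] for b in ranked]
```

With these numbers, "Book A" has the highest raw average but only three votes, so it ranks below "Book B" once the vote count is taken into account.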


Implementing and comparing Bag Of Words and TF IDF to build a model to detect fake news

Photo by Matthew Guay on Unsplash

The advent of the World Wide Web and the rapid adoption of social media platforms (such as Facebook and Twitter) paved the way for information dissemination on a scale never before witnessed in human history. With the current usage of social media platforms, consumers are creating and sharing more information than ever before, some of which is misleading and has no relevance to reality.

The following program helps identify programmatically whether a news article is fake or not. Let us first understand the two feature extraction techniques I have used to build the model:

  • Bag of…
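To make the two feature extraction techniques concrete, here is a minimal sketch in plain Python. The three headlines are invented, and the TF-IDF variant uses the textbook definition `tf * log(N / df)`; library implementations often apply smoothing and normalization, so their numbers will differ.

```python
import math
from collections import Counter

# Hypothetical mini-corpus, for illustration only.
docs = [
    "the senate passed the bill",
    "aliens passed the senate",
    "the bill was fake",
]

tokenized = [doc.split() for doc in docs]
vocab = sorted({word for tokens in tokenized for word in tokens})


def bag_of_words(tokens):
    """Count how often each vocabulary word occurs in one document."""
    counts = Counter(tokens)
    return [counts[word] for word in vocab]


def tf_idf(tokens):
    """Term frequency scaled by inverse document frequency (unsmoothed)."""
    counts = Counter(tokens)
    n_docs = len(tokenized)
    vector = []
    for word in vocab:
        tf = counts[word] / len(tokens)
        df = sum(1 for doc_tokens in tokenized if word in doc_tokens)
        idf = math.log(n_docs / df)
        vector.append(tf * idf)
    return vector


bow_vectors = [bag_of_words(tokens) for tokens in tokenized]
tfidf_vectors = [tf_idf(tokens) for tokens in tokenized]
```

Bag of Words only counts occurrences, so a frequent word like "the" gets a large value. TF-IDF down-weights words that appear in every document, which is why "the" ends up with a weight of zero here while a rare word like "aliens" keeps a positive weight.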


A Beginner’s Guide To Implement Natural Language Processing Using NLTK & TensorFlow

Photo by NOAA on Unsplash

Twitter has become an important communication channel in times of emergency. The ubiquity of smartphones enables people to announce an emergency they’re observing in real time. Because of this, more agencies, such as disaster relief organizations and news agencies, are interested in programmatically monitoring Twitter.

But it’s not always clear whether a person’s words are actually announcing a disaster. The following program helps identify programmatically whether a tweet conveys disaster information or not.

Before we build the model, it is important for us to understand a few concepts in NLP (Natural Language Processing):

  • Applying Regular Expressions
    Before we begin ‘text’ processing…
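As a small sketch of what regex-based cleaning can look like for tweets, the function below uses Python's built-in `re` module. The `clean_tweet` helper, the cleaning rules and the sample tweet are assumptions for illustration and may differ from the steps used in the article.

```python
import re


def clean_tweet(text):
    """Lowercase a tweet and strip URLs, mentions, and non-letter characters."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"@\w+", " ", text)          # drop @mentions
    text = re.sub(r"#", "", text)              # keep the hashtag word, drop '#'
    text = re.sub(r"[^a-z\s]", " ", text)      # drop digits, punctuation, emoji
    return re.sub(r"\s+", " ", text).strip()   # collapse repeated whitespace


# Made-up sample tweet, for illustration only.
sample = "Forest FIRE near La Ronge!! http://t.co/abc123 @user #wildfire"
cleaned = clean_tweet(sample)
```

The order of the substitutions matters: URLs and mentions are removed before the generic non-letter rule, otherwise fragments such as "http" or "user" would survive as ordinary words.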


Implementing a CNN on the Fruit 360 Dataset Available on Kaggle

Photo by ja ma on Unsplash

We often face a situation where, while trying to improve the accuracy of a neural network, we end up overfitting the model on the training data. This leads to poor predictions when we run the model on the test data. Hence, I take a dataset and apply techniques that not only improve accuracy but also handle overfitting.

In this article, we’ll use the following techniques to train a state-of-the-art model in less than 5 minutes to achieve over 95% accuracy in classifying images from the Fruit 360 dataset:

  1. Data Augmentation
    Data augmentation in data analysis are…


Dataset — CIFAR 10

Photo by Matthew Henry from Burst

When we develop Convolutional Neural Networks (CNNs) to classify images, the model often starts overfitting as we try to improve accuracy, which is very frustrating. Hence, I list the following techniques, which improve model performance without overfitting the model on the training data.

  1. Data normalization
    We normalized the image tensors by subtracting the mean and dividing by the standard deviation of pixels across each channel. Normalizing the data prevents the pixel values from any one channel from disproportionately affecting the losses and gradients.
  2. Data augmentation
    We applied random transformations while loading images…
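The two techniques above can be sketched in plain Python as follows. This is only an illustration on tiny made-up images stored as nested lists shaped `[channel][row][col]`; the helper names are hypothetical, a horizontal flip stands in for the random transformations, and in practice a tensor library would do this far more efficiently.

```python
import random
import statistics

# Two hypothetical 3-channel, 2x2 images with pixel values in [0, 1].
images = [
    [[[0.0, 0.2], [0.4, 0.6]], [[0.1, 0.1], [0.3, 0.3]], [[0.5, 0.7], [0.9, 0.9]]],
    [[[1.0, 0.8], [0.6, 0.4]], [[0.5, 0.5], [0.7, 0.7]], [[0.1, 0.3], [0.5, 0.5]]],
]


def channel_stats(images):
    """Per-channel (mean, std) computed over every pixel of every image."""
    stats = []
    for c in range(len(images[0])):
        pixels = [p for img in images for row in img[c] for p in row]
        stats.append((statistics.fmean(pixels), statistics.pstdev(pixels)))
    return stats


def normalize(image, stats):
    """Subtract the channel mean and divide by the channel std."""
    return [
        [[(p - mean) / std for p in row] for row in channel]
        for channel, (mean, std) in zip(image, stats)
    ]


def random_horizontal_flip(image, rng, p=0.5):
    """Mirror the image left-to-right with probability p."""
    if rng.random() < p:
        return [[row[::-1] for row in channel] for channel in image]
    return image


stats = channel_stats(images)
normalized = [normalize(img, stats) for img in images]

rng = random.Random(0)  # seeded so the example is reproducible
augmented = [random_horizontal_flip(img, rng) for img in normalized]
```

The statistics are computed once over the training images and reused, while the random flip is applied each time an image is loaded, so the model sees slightly different inputs in every epoch.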


Dataset — CIFAR 10 (acc > 75%)

Photo by JJ Ying on Unsplash

In my previous blog, I developed a feed-forward neural network and trained it on the CIFAR 10 dataset. Because a feed-forward neural network is not powerful on image datasets, we achieved an accuracy of only 50%. Here I will build a CNN model from scratch and validate its performance on the CIFAR 10 dataset. But before we get started, I will try to answer a few fundamental questions.

  1. What is a CNN?
    A Convolutional Neural Network (ConvNet/CNN) is a Deep Learning algorithm which can take in an input image, assign importance (learnable weights and biases) to various aspects/objects in the image and be able to differentiate one from…
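To give a feel for the core operation inside a CNN layer, here is a minimal sketch of a 2D convolution (strictly speaking a cross-correlation, with no padding and stride 1) in plain Python. The image and the kernel values are hand-picked for illustration; in a real CNN the kernel weights are learned during training.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hypothetical 4x4 grayscale image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked kernel that responds to dark-to-bright vertical edges.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d(image, vertical_edge)
```

The resulting feature map is non-zero only in the middle column, where the kernel sits on the boundary between the dark and bright halves. That is the sense in which a filter assigns importance to a particular aspect of the image.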


Dataset — CIFAR 10

If you want to get started with FFNNs (feed-forward neural networks) but are not quite sure which dataset to pick to begin with, then you are at the right place. We see neural network implementations everywhere from classical machine learning to deep neural networks. Today, neural networks are used for solving many business problems such as sales forecasting, customer research, data validation, and risk management. Let us start by asking a couple of fundamental questions:

  1. What is an FFNN?
    A feedforward neural network is an artificial neural network wherein connections between the nodes do not form a cycle. As…
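The "no cycles" property is easiest to see in code: data flows from the input through each layer exactly once. Below is a minimal forward pass for a two-layer network in plain Python. The layer sizes and weight values are arbitrary assumptions for illustration, and training (backpropagation) is left out.

```python
import math


def relu(x):
    return max(0.0, x)


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W @ inputs + b)."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Arbitrary example weights: 3 inputs -> 2 hidden units -> 2 output classes.
W1 = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]]
b1 = [0.0, 0.1]
W2 = [[1.0, -1.0], [-1.0, 1.0]]
b2 = [0.0, 0.0]


def forward(x):
    hidden = dense(x, W1, b1, relu)
    logits = dense(hidden, W2, b2, lambda z: z)  # no activation before softmax
    return softmax(logits)


probs = forward([1.0, 2.0, 3.0])
```

Each layer only consumes the output of the previous one, so there is no feedback loop as there would be in a recurrent network.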


Build On Dataset — Wheat Seed Species Prediction

Photo by Evi Radauscher on Unsplash

If you want to get started with PyTorch but are not quite sure which dataset to pick to begin with, then you are at the right place. We see PyTorch implementations everywhere from classical machine learning to deep neural networks. I can’t wait to get started, but first let us answer a couple of fundamental questions:

  1. What is PyTorch?
    PyTorch is an open-source, community-driven deep learning framework developed by Facebook’s artificial intelligence research group. …


Photo by Tim Mossholder on Unsplash

If you are getting started with machine learning and looking for a dataset to test your skills and understanding, then you are at the right place. The Swedish auto insurance dataset is ideal for beginners, as the volume of data is low (just 63 records) and you need only minimal feature engineering to understand how the features relate to the labels (or the final output).

  1. Introduction
  2. Loading the data
  3. Feature Analysis
  4. Data cleaning
  5. Applying the train/test split
  6. Training an ML model
  7. Cross validation to select best ML model
  8. Model performance

The Swedish Auto Insurance Dataset involves predicting the…

Hargurjeet

Data Science Practitioner | Machine Learning | Deep Learning
