Apache Spark is one of the largest open source projects in data processing. It is fast and widely used by data science teams. I will show you how to install Spark in standalone mode on Ubuntu 16.04 LTS.
The TensorFlow Object Detection API is a great tool for performing object detection. The API comes ready to use with pretrained models, which will get you detecting objects in images or videos in no time.
In this article you will learn how to install the TensorFlow Object Detection API on Windows.
TensorFlow is a popular machine learning library released by Google. In this article, we will create a neural network in TensorFlow to classify the Iris species and train the network using stochastic gradient descent.
This financial company was extremely successful at implementing advanced analytics in its organization, reaping benefits that exceeded 200x its costs. Read on to find out what its blueprint for analytical success was.
Google Colaboratory is a cloud-based tool for machine learning research and collaboration built on top of Jupyter. Build, share and collaborate on your machine learning projects easily with Colaboratory.
The scikit-learn Python library provides built-in classes that apply one-hot encoding to your categorical features so that you can use them in machine learning models.
Activation functions give neural networks the ability to learn complex nonlinear relationships. Read along to find out the top 3 activation functions used in neural networks.
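As a quick illustration, here is a minimal sketch of three commonly used activation functions. The teaser does not name its top 3, so picking sigmoid, tanh and ReLU here is my assumption, and the function names are only illustrative.

```python
import math


def sigmoid(x):
    """Squash x into the range (0, 1)."""
    # Split on the sign of x so math.exp never overflows for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def tanh(x):
    """Squash x into the range (-1, 1)."""
    return math.tanh(x)


def relu(x):
    """Pass positive values through and clip negative values to zero."""
    return max(0.0, x)


# Apply each function to a few sample inputs.
inputs = [-2.0, 0.0, 2.0]
sigmoid_outputs = [sigmoid(x) for x in inputs]
tanh_outputs = [tanh(x) for x in inputs]
relu_outputs = [relu(x) for x in inputs]
```

All three are nonlinear, which is what lets stacked layers model more than a straight line.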
In this blog post, we will build our own Twitter Search class to search for tweets using Python. By doing so, we will gain a better understanding of Twitter’s API and REST interfaces in general.
Sentiment analysis refers to the use of text analytics and natural language processing, among other techniques, to automatically identify the writer’s attitude towards a given product, service or topic. In this blog post we will use a set of available APIs from Microsoft Cognitive Services to perform sentiment analysis using Python.
Are you trying to implement a machine learning algorithm to classify documents? Need to determine the intent of a sentence to use in a chatbot? Either way, you might be asking yourself the same question: how do I convert text into a form that my machine learning algorithm can use? In this post we will go over a simple model for converting sentences into vectors, called the Bag of Words model. We will implement this algorithm in Python from scratch and then use scikit-learn’s built-in functions to vectorize sentences.
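To give a flavor of the idea, here is a minimal from-scratch sketch of Bag of Words. It assumes whitespace tokenization and lowercasing, and the helper names `build_vocabulary` and `vectorize` are hypothetical, not taken from the post.

```python
def build_vocabulary(sentences):
    """Map every distinct word to a column index (alphabetical order)."""
    words = sorted({word for s in sentences for word in s.lower().split()})
    return {word: i for i, word in enumerate(words)}


def vectorize(sentence, vocabulary):
    """Count how many times each vocabulary word occurs in the sentence."""
    vector = [0] * len(vocabulary)
    for word in sentence.lower().split():
        if word in vocabulary:  # words outside the vocabulary are ignored
            vector[vocabulary[word]] += 1
    return vector


sentences = ["the cat sat", "the dog sat on the cat"]
vocabulary = build_vocabulary(sentences)
vectors = [vectorize(s, vocabulary) for s in sentences]
# Columns are: cat, dog, on, sat, the
# vectors == [[1, 0, 0, 1, 1], [1, 1, 1, 1, 2]]
```

Each sentence becomes a fixed-length list of word counts, so word order is lost but the result is something a numeric algorithm can consume.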
Feature selection is an important part of building machine learning models. As the saying goes, garbage in, garbage out. Training your algorithms with irrelevant features will hurt the performance of your model. Feature selection, also known as variable selection or attribute selection, together with engineering new features, is often what separates the best performing models from the rest.
Building neural networks is a complex endeavor with many parameters to tweak before arriving at the final version of a model. On top of this, the two most widely used numerical platforms for deep learning and neural network machine learning models, TensorFlow and Theano, are too complex to allow for rapid prototyping. The Keras deep learning library for Python helps bridge the gap between prototyping speed and the use of these advanced numerical platforms. Keras is a high-level API for building neural networks that runs on top of TensorFlow, Theano or CNTK. It allows for rapid prototyping, supports both recurrent and convolutional neural networks, and runs on either your CPU or GPU for increased speed.
If you have been using machine learning, you will sooner rather than later realize that machine learning algorithms require numerical inputs. Unluckily for us, our features come in various forms. Some will be continuous, others categorical in numeric or text format. Machine learning algorithms cannot work with variables in text form, so we must perform certain preprocessing steps to get our data into the right format. How do we deal with these categorical variables? Worry no more! In this blog post I will explain how to deal with them using a technique known as one-hot encoding.
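Here is a rough sketch of what one-hot encoding does, written in plain Python so it has no dependencies. The function name `one_hot_encode` and the sample data are made up for illustration; in practice a library encoder would usually do this for you.

```python
def one_hot_encode(values):
    """Turn a list of category labels into rows of 0/1 indicator columns."""
    categories = sorted(set(values))  # one column per distinct category
    index = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # mark the column matching this value
        encoded.append(row)
    return categories, encoded


colors = ["red", "green", "blue", "green"]
categories, encoded = one_hot_encode(colors)
# categories == ["blue", "green", "red"]
# encoded == [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 0]]
```

Every row contains exactly one 1, so no artificial ordering is implied between the categories.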
Probability distributions are a powerful tool for modeling random processes. They are widely used in statistics, simulations, engineering and various other settings. I have had to use them in various projects to correctly model randomness. There are many probability distributions to choose from, ranging from the well-known normal distribution to others such as the logistic and Weibull distributions. The problem I have repeatedly faced is finding an easy-to-use tool to quickly fit the best distribution to my data and then use that best-fit distribution to generate random numbers. Once again Python shows its flexibility for data science with SciPy, one of the main Python packages for mathematics, science and engineering. We will be using the SciPy package to tackle this task.
In the previous post we discussed the theory and history behind the perceptron algorithm developed by Frank Rosenblatt. Even though this is a very basic algorithm that is only capable of modeling linear relationships, it serves as a great starting point for understanding neural network machine learning models. In this post, we will implement this basic perceptron in Python.
The perceptron is a supervised learning algorithm used for binary classification. It is one of the oldest algorithms in machine learning, going back to the 1950s, and it has been the inspiration for many state-of-the-art algorithms used today.
Machine learning is big, it’s growing and it’s here to stay.
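For a concrete picture, here is a minimal sketch of a perceptron trained with the classic error-driven update rule. The toy task (logical AND), the learning rate of 1.0 and the epoch count are my own assumptions for the example, not values from the post.

```python
def predict(weights, bias, x):
    """Return 1 if the weighted sum reaches the threshold, else 0."""
    activation = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if activation >= 0 else 0


def train_perceptron(samples, labels, learning_rate=1.0, epochs=20):
    """Nudge the weights towards each misclassified sample."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = y - predict(weights, bias, x)  # -1, 0 or +1
            weights = [w + learning_rate * error * xi
                       for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias


# Logical AND is linearly separable, so a perceptron can learn it.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]
weights, bias = train_perceptron(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
```

Because the decision boundary is a single line, the same code cannot learn a problem such as XOR, which is the linear limitation mentioned above.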