A Gentle Guide to the complexities of model deployment, and integrating with the enterprise application and data pipeline. What the Data Scientist, Data Engineer, ML Engineer, and ML Ops do, in Plain English.

Photo by Ricardo Rocha on Unsplash

Let’s say we’ve identified a high-impact business problem at our company, built an ML (machine learning) model to tackle it, trained it, and are happy with the prediction results. This was a hard problem to crack that required much research and experimentation. So we’re excited about finally being able to use the model to solve our users’ problem!

However, what we’ll soon discover is that building the model itself is only the tip of the iceberg. The bulk of the hard work to actually put this model into production is still ahead of us. …


A Gentle Guide to the lifecycle of a Machine Learning project in the Enterprise, the roles involved and the challenges of building models, in Plain English

Photo by Greg Rakozy on Unsplash

What is Enterprise ML?

What does it take to deliver a machine learning (ML) application that provides real business value to your company?

Once you’ve done that and proved the substantial benefit that ML can bring to the company, how do you expand that effort to additional use cases, and really start to fulfill the promise of ML?

And then, how do you scale up ML across the organization and streamline the ML development and delivery process to standardize ML initiatives, share and reuse work and iterate quickly?

What are the best practices that some of the world’s leading tech companies have adopted?


A Gentle Guide to how the Attention Score calculations capture relationships between words in a sequence, in Plain English.

Photo by Olav Ahrens Røtne on Unsplash

Transformers have taken the world of NLP by storm in the last few years. Now they are being used with success in applications beyond NLP as well.

The Transformer owes its power to the Attention module, which captures the relationships between each word in a sequence and every other word.

But the all-important question is: how exactly does it do that?

In this article, we will attempt to answer that question, and understand why it performs the calculations that it does.
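As a rough preview of the kind of calculation involved, here is a minimal sketch of scaled dot-product attention scores in plain Python. It makes some simplifying assumptions: the word embeddings are made-up two-dimensional vectors, and the same vectors are used as both queries and keys, whereas a real Transformer first passes them through learned Query and Key layers. The function names are hypothetical and chosen only for illustration.

```python
import math


def softmax(row):
    """Turn a list of raw scores into weights that sum to 1."""
    largest = max(row)  # subtract the max for numerical stability
    exps = [math.exp(x - largest) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_scores(queries, keys):
    """Score every query word against every key word.

    Each raw score is a dot product scaled by sqrt(d_k); softmax then
    converts each row of scores into attention weights.
    """
    d_k = len(keys[0])
    scores = []
    for q in queries:
        row = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        scores.append(softmax(row))
    return scores


# Made-up embeddings for a three-word sequence (an assumption for illustration).
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = attention_scores(embeddings, embeddings)

for row in weights:
    print([round(w, 3) for w in row])
```

Each row holds one word's attention weights over every word in the sequence. Words whose vectors point in similar directions get a larger dot product, and therefore a larger weight.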

I have a few more articles in my series on Transformers. In those articles…


A Gentle Guide to the reasons for the Batch Norm layer’s success in making training converge faster, in Plain English

Photo by AbsolutVision on Unsplash

The Batch Norm layer is frequently used in deep learning models in association with a Convolutional or Linear layer. Many state-of-the-art Computer Vision architectures such as Inception and ResNet rely on it to create deeper networks that can be trained faster.

In this article, we will explore why Batch Norm works and why it requires fewer training epochs when training a model.

You might also enjoy reading my other article on Batch Norm which explains, in simple language, what Batch Norm is and walks through, step by step, how it operates under the hood.

And if you’re interested in Neural…


A Gentle Guide to boosting model training and hyperparameter tuning with Optimizers and Schedulers, in Plain English

Photo by Tim Mossholder on Unsplash

Optimizers are a critical component of training a neural network, and Schedulers are a vital part of your deep learning toolkit. During training, they play a key role in helping the network learn to make better predictions.

But what ‘knobs’ do they have to control their behavior? And how can you make the best use of them to tune hyperparameters to improve the performance of your model?

When defining your model there are a few important choices to be made — how to prepare the data, the model architecture, and the loss function. …


A Gentle Guide to an all-important Deep Learning layer, in Plain English

Photo by Reuben Teo on Unsplash

Batch Norm is an essential part of the toolkit of the modern deep learning practitioner. Soon after it was introduced in the Batch Normalization paper, it was recognized as being transformational in creating deeper neural networks that could be trained faster.

Batch Norm is a neural network layer that is now commonly used in many architectures. It often gets added as part of a Linear or Convolutional block and helps to stabilize the network during training.

In this article, we will explore what Batch Norm is, why we need it and how it works.
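To give a feel for the core calculation, here is a minimal sketch of the training-time Batch Norm step for a single feature, in plain Python. It assumes the standard formulation (normalize with the mini-batch mean and variance, then scale by gamma and shift by beta) and leaves out the running statistics that are used at inference time. The function name and the sample activations are made up for illustration.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature across a mini-batch, then scale and shift.

    gamma and beta stand in for the layer's learnable parameters;
    eps guards against division by zero.
    """
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n  # biased (population) variance
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]


# Made-up activations of one feature for a mini-batch of four samples.
activations = [2.0, 4.0, 6.0, 8.0]
normalized = batch_norm(activations)
print([round(x, 3) for x in normalized])
```

With gamma at 1 and beta at 0, the output has a mean of roughly zero and a variance of roughly one, whatever the scale of the incoming activations.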

You might also enjoy reading my…


A Gentle Guide to two essential metrics (BLEU Score and Word Error Rate) for NLP models, in Plain English

Photo by engin akyurt on Unsplash

Most NLP applications such as machine translation, chatbots, text summarization, and language models generate some text as their output. In addition, applications like image captioning or automatic speech recognition (i.e., Speech-to-Text) output text, even though they may not be considered pure NLP applications.

How good is the predicted output?

The common problem when training these applications is deciding how ‘good’ that output is.

With applications like, say, image classification, the predicted class can be compared unambiguously with the target class to decide whether the output is correct or not. However, the problem is much trickier with applications where the output is a sentence.
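To make this concrete, here is a minimal sketch of Word Error Rate, one of the two metrics named above. It assumes the common definition: the word-level edit distance (substitutions, insertions and deletions) between the predicted sentence and the reference, divided by the number of words in the reference. The function name and the example sentences are made up for illustration.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: a reference word is missing
                curr[j - 1] + 1,     # insertion: an extra predicted word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Made-up sentences: one substitution ("sat" -> "sit") and one deletion ("the").
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(round(wer, 3))
```

Here two word-level errors against a six-word reference give a WER of about 0.333. A perfect prediction scores 0, and the value can exceed 1 when the prediction contains many extra words.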


An end-to-end example using Encoder-Decoder with Attention in Keras and TensorFlow 2.0, in Plain English

Photo by Max Kleinen on Unsplash

Generating Image Captions using deep learning has produced remarkable results in recent years. One of the most widely-used architectures was presented in the Show, Attend and Tell paper.

The innovation that it introduced was to apply Attention, which has seen much success in the world of NLP, to the Image Caption problem. Attention helped the model focus on the most relevant portion of the image as it generated each word of the caption.

In this article, we will walk through a simple demo application to understand how this architecture works in detail.

I have another article that provides an overview…


A Gentle Guide to Image Feature Encoders, Sequence Decoders, Attention, and Multi-modal Architectures, in plain English

Photo by Brett Jordan on Unsplash

Image Captioning is a fascinating application of deep learning that has made tremendous progress in recent years. What makes it even more interesting is that it brings together both Computer Vision and NLP.

What is Image Captioning?

It takes an image as input and produces a short textual summary describing the content of the photo.


A Gentle Guide to Feature Engineering and Visualization with Geospatial data, in Plain English

Photo by Daniel Olah on Unsplash

Location data is an important category of data that you frequently have to deal with in many machine learning applications. Location data typically provides a lot of extra context to your application’s data.

For instance, you might want to predict e-commerce sales based on your customer data. The machine learning model might be able to identify more accurate customer buying patterns by also accounting for the customer location information. This would become all the more important if the predictions were for physical sites (rather than online), such as retail stores, restaurants, hotels, or hospitals.

Ketan Doshi

Machine Learning and Big Data
