Source: Image by JuraHeep from Pixabay

Data Engineering

For deploying big-data analytics, data science, and machine learning (ML) applications in the real world, analytics tuning and model training account for only around 25% of the work. Approximately 50% of the effort goes into making data ready for analytics and ML. The remaining 25% goes into making insights and model inferences easily consumable at scale. The big data pipeline puts it all together. It is the railroad on which the heavy and marvelous wagons of ML run. Long-term success depends on getting the data pipeline right.

This article gives an introduction to the data pipeline and an overview of big data architecture…


Photo by Jace & Afsoon on Unsplash

Data Engineering

How do you choose a database? Maybe you assess whether the use case needs a relational database. Depending on the answer, you pick your favorite SQL or NoSQL datastore and make it work. It is a prudent tactic: a known devil is better than an unknown angel.

Picking the right datastore can simplify your application. A wrong choice can add friction. This article will help you expand your list of known devils with an in-depth overview of various datastores. It covers the following:

  • Database parts that define a datastore’s characteristics.
  • Datastores categorized by data types: deep dive into databases for…


Photo by Mr TT on Unsplash

Data Analytics, Notes from Industry

“Let’s collect all data we can, and we will fish for insights later.” Have you heard this before?

That approach seldom works. On the rare occasions when it does work a little, the ROI is very low relative to the cost of collecting, processing, and storing large volumes of data. Analytics yields better returns when you start with a goal.

Besides, not all analytics are equal. Fun stats are amusing. But actionable insights that can guide you to the next steps are way more valuable.

The Drivetrain Approach offers a systematic way to produce actionable insights. It has four steps:

  1. Define Objective: Start…


Photo by Michael Heng on Unsplash

Software Engineering

The great thing about starting a new project is that you get a clean slate. No baggage of design choices that you hated to look at every day in your last project. But how many times have you seen a shiny new project not turn into the same intractable mess?

It is more likely to happen in a fast-paced startup. The faster the pace, the sooner it happens. So how do you move fast without being trapped in analysis paralysis, while keeping technical debt at a manageable level?

You design for change. Ignore the refrain that prevention is better…


Photo by Victor Freitas from Pexels

Programming Tips

Applying a function to all rows in a Pandas DataFrame is one of the most common operations during data wrangling. The Pandas DataFrame apply function is the most obvious choice for doing it. It takes a function as an argument and applies it along an axis of the DataFrame. However, it is not always the best choice.

In this article, you will measure the performance of 12 alternatives. With a companion Code Lab, you can try it all in your browser. No need to install anything on your machine.
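
As a minimal sketch of the row-wise apply described above, assuming a small made-up DataFrame (the column names `price` and `quantity` and the helper `row_total` are hypothetical and not taken from the article):

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"price": [10.0, 20.0, 5.0], "quantity": [2, 1, 4]})


def row_total(row):
    """Compute the total for one row (the row arrives as a pandas Series)."""
    return row["price"] * row["quantity"]


# axis=1 applies the function to each row;
# axis=0 (the default) would apply it to each column instead.
df["total"] = df.apply(row_total, axis=1)

# The same result with a vectorized expression, one common alternative to apply.
df["total_vectorized"] = df["price"] * df["quantity"]
```

Both columns hold the same values; the vectorized form avoids calling a Python function once per row, which is typically why it is faster on large DataFrames.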

Problem

Recently, I was analyzing user behavior data for an e-commerce app. Depending…


(Image by Slang Labs)

Conversational AI

Bhārat Bhāṣā Stack will catalyze Voice Assistant and Conversational AI innovations for vernacular Indic languages as India Stack did for FinTech.

A decade ago, it was unimaginable.

That one would pay a street vendor in a nondescript small town in India by scanning a QR code hung on his cart with a mobile phone. Even for an amount as little as 50 rupees (less than a dollar).

That there would be many mobile apps and payment wallets from banks and non-banks. All seamlessly interoperable. Any two parties would transact by sharing an email-like wallet address. Without paying any transaction fee.

That myriad…


Photo by Vanessa Bucceri on Unsplash

Microservices

A data model organizes data elements and formalizes their relationships with one another. In database design, data modeling is the process of analyzing application requirements and designing conceptual, logical, and physical data models for storage. However, data storage is only one, albeit an important, aspect of microservices.

There are three related but distinct data models in a microservice:

  • API Data Model for interactions: It defines the schema of data payload that can be sent to or is received from the endpoints of a microservice. Also known as communication or exchange data model.
  • Object Data Model for computations: It is…


My trek mates and I climbing Mayali Pass in Uttarakhand Himalaya, India

Machine Learning for Developers

You are a Software Engineer. You notice Artificial Intelligence, Machine Learning, Deep Learning, Data Science buzzwords all around. You wonder what these phrases mean, whether all this is for real and useful or is yet another hype and passing fad.

You want to figure out how it is changing or will change the computer/IT industry, and why you should care, if at all. You google it, and you read various articles, blogs, and tutorials. You get some idea but are also overwhelmed by the enormous wealth of math, tools, and frameworks you discover.

You wish someone could give an overview…


Seashells and tree annual rings are nature’s meticulous logs. Image by Friedrich Frühling from Pixabay

Microservices

Nature is a meticulous logger, and its logs are beautiful. Calcium carbonate layers in a seashell are nature’s log of ocean temperature, water quality, and food supply. Annual rings in tree cambium are nature’s log of dry and rainy seasons and forest fires. Fossils in the layers of sedimentary rocks are nature’s log of the flora and fauna that existed at the time.

In software projects, logs, like tests, are often afterthoughts. But at Slang Labs, we take inspiration from nature’s elegance and meticulousness.

We are building a platform for programmers to make interaction with their mobile and web…


Python Microservices: Build and Test REST endpoints with Tornado

Microservices

At Slang Labs, we are building a platform for programmers to easily and quickly add multilingual, multimodal Voice Augmented eXperiences (VAX) to their mobile and web apps. Think of an assistant like Alexa or Siri, but running inside your app and tailored for your app.

The platform is powered by a collection of microservices. For implementing these services, we chose Tornado because it has AsyncIO APIs. It is not heavyweight. Yet, it is mature and has a number of configurations, hooks, and a nice testing framework.

This blog post covers some of the best practices we learned while building these…

Satish Chandra Gupta

Cofounder @SlangLabs. Ex Amazon, Microsoft Research. I learn, do, and write about Machine Learning in production. Newsletter: http://ML4Devs.com
