For deploying big-data analytics, data science, and machine learning (ML) applications in the real world, analytics tuning and model training are only around 25% of the work. Approximately 50% of the effort goes into making data ready for analytics and ML. The remaining 25% of the effort goes into making insights and model inferences easily consumable at scale. The big data pipeline puts it all together. It is the railroad on which heavy and marvelous wagons of ML run. Long-term success depends on getting the data pipeline right.
This article gives an introduction to the data pipeline and an overview of big data architecture…
How do you choose a database? Maybe you assess whether the use case needs a relational database. Depending on the answer, you pick your favorite SQL or NoSQL datastore, and make it work. It is a prudent tactic: a known devil is better than an unknown angel.
Picking the right datastore can simplify your application. A wrong choice can add friction. This article will help you expand your list of known devils with an in-depth overview of various datastores. It covers the following:
“Let’s collect all data we can, and we will fish for insights later.” Have you heard this before?
That approach seldom works. On the rare occasions when it does work a little, the ROI is very low relative to the cost of collecting, processing, and storing large volumes of data. Analytics yields better returns when you start with a goal.
Besides, not all analytics are equal. Fun stats are amusing. But actionable insights that can guide you to the next steps are way more valuable.
The Drivetrain Approach offers a systematic way to produce actionable insights. It has four steps:
The great thing about starting a new project is that you get a clean slate. No baggage of design choices that you hated to look at every day in your last project. But how often have you seen a shiny new project not turn into the same intractable mess?
It is more likely to happen in a fast-paced startup. The faster the pace, the sooner it happens. So how do you move fast without getting trapped in analysis paralysis, while keeping technical debt at a manageable level?
Applying a function to all rows in a Pandas DataFrame is one of the most common operations during data wrangling. The Pandas DataFrame apply function is the most obvious choice for doing it. It takes a function as an argument and applies it along an axis of the DataFrame. However, it is not always the best choice.
In this article, you will measure the performance of 12 alternatives. With a companion Code Lab, you can try it all in your browser. No need to install anything on your machine.
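As a minimal sketch of the operation being discussed, the snippet below applies a function to every row with `DataFrame.apply` and then computes the same result with plain column arithmetic. The sample data, the column names (`price`, `quantity`), and the helper `row_total` are hypothetical, chosen only for illustration; this is not the benchmark from the article, and column arithmetic is just one possible alternative.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative assumptions.
df = pd.DataFrame({"price": [10.0, 20.0, 30.0], "quantity": [1, 2, 3]})


def row_total(row):
    """Compute the total for one row (the row is passed in as a Series)."""
    return row["price"] * row["quantity"]


# axis=1 applies the function to each row rather than to each column.
df["total_apply"] = df.apply(row_total, axis=1)

# One possible alternative: vectorized arithmetic on whole columns.
df["total_vectorized"] = df["price"] * df["quantity"]

print(df)
```

Both columns hold the same values. The row-wise version calls a Python function once per row, which is usually why alternatives are worth measuring on larger frames.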
Recently, I was analyzing user behavior data for an e-commerce app. Depending…
Bhārat Bhāṣā Stack will catalyze Voice Assistant and Conversational AI innovations for vernacular Indic languages as India Stack did for FinTech.
A decade ago, it was unimaginable.
That one would pay a street vendor in a nondescript small town in India by scanning, with a mobile phone, a QR code hung on his cart. Even for an amount as little as 50 rupees (less than a dollar).
That there would be many mobile apps and payment wallets from banks and non-banks. All seamlessly interoperable. Any two parties would transact by sharing an email-like wallet address. Without paying any transaction fee.
A data model organizes data elements and formalizes their relationships with one another. In database design, data modeling is the process of analyzing application requirements and designing conceptual, logical, and physical data models for storage. However, data storage is only one, albeit an important, aspect of microservices.
There are three related but distinct data models in a microservice for:
You are a Software Engineer. You notice Artificial Intelligence, Machine Learning, Deep Learning, Data Science buzzwords all around. You wonder what these phrases mean, whether all this is for real and useful or is yet another hype and passing fad.
You want to figure out how it is changing or will change the computer/IT industry, and why you should care, if at all. You google it, and you read various articles, blogs, and tutorials. You get some idea but are also overwhelmed by the enormous wealth of math, tools, and frameworks you discover.
You wish someone could give an overview…
Nature is a meticulous logger, and its logs are beautiful. Calcium carbonate layers in a seashell are nature’s log of ocean temperature, water quality, and food supply. Annual rings in tree cambium are nature’s log of dry and rainy seasons and forest fires. Fossils in the layers of sedimentary rocks are nature’s log of the flora and fauna that existed at the time.
In software projects, logs, like tests, are often afterthoughts. But at Slang Labs, we take inspiration from nature’s elegance and meticulousness.
We are building a platform for programmers to make interaction with their mobile and web…
At Slang Labs, we are building a platform for programmers to easily and quickly add multilingual, multimodal Voice Augmented eXperiences (VAX) to their mobile and web apps. Think of an assistant like Alexa or Siri, but running inside your app and tailored for your app.
The platform is powered by a collection of microservices. For implementing these services, we chose Tornado because it has AsyncIO APIs. It is not heavyweight. Yet, it is mature and has a number of configurations, hooks, and a nice testing framework.
This blog post covers some of the best practices we learned while building these…