Data DIY

Installing MongoDB on Ubuntu

If your company is in the business of using, handling or depending on data, chances are you’re in need of a document-oriented, NoSQL database....

Calculating Word Frequency in Dataframe

Alright, so in this short tutorial we'll calculate word frequency and visualize it. It's a relatively simple task. But when it comes to stopwords and languages different from...

Finding the Best Selling Product on Amazon

There’s no doubt that in order to make a decent profit on Amazon, it is essential to choose the best product to sell. To...

Python List Comprehension

When doing data science, you might find yourself wanting to read lists of lists, filter column names, remove vowels from a list, or flatten...

Delivering Data through External Sources

A common step when working with BigML is extracting data from a database or document repository for uploading as...

Tutorial on Linux server port traffic

Every network administrator needs to know how to listen to port traffic on a server. Here's one way to...

Types of Data Scraping

There is a lot of data presented in a table format inside the web pages. However, it could be...

Installing AWX Ansible web GUI on CentOS 8

Ansible administration is most often done from the command line. Make that task a bit more efficient with the...

Imbalanced adult income classification dataset

Many binary classification tasks do not have an equal number of examples from each class, i.e. the class distribution is skewed or imbalanced. A popular...

Tutorial on imbalanced Classification with Dataset of Fraudulent Credit Card Transactions

Fraud is a major problem for credit card companies, both because of the large volume of transactions that are...

Tutorial on imbalanced multiclass with E.coli dataset

Multiclass classification problems are those where a label must be predicted, but there are more than two labels that may be predicted. These are challenging...

Installing R on your desktop

This is a beginner's guide designed to save you a headache and valuable time if you decide...

Fundamentals of Delta Lake

You might be hearing a lot about Delta Lake nowadays. Yes, it is because of its introduction of new...

Applying Elasticsearch on Kubernetes

Big data, AI, machine learning, and numerous others are all buzzwords we seem to throw around lightly in recent years. Even though...

A beginner’s guide to Data Science

“Data! Data! Data!” he cried impatiently. “I can’t make bricks without clay.”
