What’s the difference between all these?

Everyone with some basic knowledge of programming knows what is meant by iterating. For example, if we had an object holding a gigantic list of random numbers, iterating through the list means going through each value in it. We can do this using either a for loop or a while loop like below:

list_example = [1, 21, 3, 50, 66, 7, 90, 31, 236, 3400]

# for loop: visits each element directly
for element in list_example:
    print(element)

# while loop: needs an index variable that we advance ourselves
i = 0
while i < len(list_example):
    print(list_example[i])
    i += 1

Both of these examples are iterating through the list, but the while loop needs an index variable. Using the while loop is…


Nothing is certain when it comes to trading, but it is certain that there is money to be made. Jane Street is well known for using technology to trade and has challenged the Kaggle community to create a model that can identify trading opportunities. The firm builds its own trading models and uses machine learning to capitalize on the inefficiencies of the market, and now I want to go through the experience of trying to do the same!

The data we are given has 100+ features that represent a trade. Each trade…


Introduction to Python Programming by learning the Essentials

Nowadays, Python is everywhere, and there’s obviously a reason for that! Python has been an object-oriented programming language since its inception and is not just a scripting language! Python is especially powerful in development because there is no separate compile step after writing code. By this I mean Python lets you read, evaluate, and print, as opposed to write, compile, test, and recompile. Now that you’ve been convinced to use Python, it’s time to learn the basics.

Variables and Types

A variable in Python is a little different from variables in other languages in that it doesn’t store values/objects, but rather acts as a pointer…
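
To illustrate the pointer idea, here is a minimal sketch (the names `a` and `b` are just illustrative): assigning one variable to another makes both names refer to the same object instead of copying it.

```python
# Two names can point at the same list object.
a = [1, 2, 3]
b = a            # no copy is made; b refers to the same object as a

b.append(4)      # mutate the object through the name b

same_object = a is b       # identity check: both names point at one object
a_after_append = list(a)   # snapshot of what a sees after the change

print(same_object)
print(a_after_append)
```

Because `a` and `b` are two names for one list, the change made through `b` is also visible through `a`.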


A review for me and maybe for you too!

Introduction

SQL stands for Structured Query Language. It is a common language for analysts, data engineers, data scientists, or really anyone who works with data. It’s easy to pick up and important to know, but actual experience can range from working with the language yourself on 1,000 rows and a few tables to holding a full-time position working with millions of rows. The point is that no matter the number of rows, the queries would be very similar.

Structure

MySQL, MS SQL Server, Oracle, etc are different types of…


Principal Component Analysis, commonly known as PCA, is widely used for dimensionality reduction. If we had a data set with maybe four or five features, we could easily plot all the variables against each other in pair plots. We’d be able to observe the pairs and see if there are any possible correlations, but what if we had hundreds of features? What if we wanted to compare combinations of variables? It would be much harder, which gives us the need to do a dimensionality reduction of some sort.
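
One way to see why pair plots stop scaling is to count them: the number of distinct pairs grows quadratically with the number of features. A small sketch of that arithmetic (the helper name `pair_plot_count` is hypothetical, chosen only for this illustration):

```python
from math import comb

def pair_plot_count(n_features):
    """Number of distinct feature pairs, i.e. n choose 2."""
    return comb(n_features, 2)

# Compare a small data set with increasingly wide ones.
for n in (5, 20, 100):
    print(n, "features ->", pair_plot_count(n), "pair plots")
```

Five features give 10 pairs, which is easy to eyeball, while 100 features give 4,950.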

The simplest explanation that I could think of for PCA is to look…


Exploring job postings to find popular buzzwords

Data Science is a very popular phrase, but much too broad. When I learned data science in a boot camp, I was taught programming in Python, statistics, machine learning and working with big data. I essentially learned the data science workflow and got an amazing amount of knowledge out of it. After finishing the boot camp with stronger skill sets, it was of course time to find a job to get some professional experience, but before that I needed to craft a great resume.

With not much experience in the tech field behind me, I was told I…


Unsupervised learning method of cluster analysis, Hierarchical Clustering Analysis (HCA). Part 2 of 2 of Clustering Analysis

What is HCA Clustering?

As a reminder from the previous post, clustering analysis is an unsupervised method or technique for breaking down data into groups/clusters. We aren’t predicting any labels, but rather finding ways to form groups, this time in a different way from k-Means.

As a refresher, k-Means meant picking a number of starting points equal to the number of groups we want, assigning every point to its closest starting point, taking the mean of the points in each group as a new “starting point” and repeating the previous two steps. Remember, for k-Means, there’s a downside of potentially picking too many or too few groups/starting points. …
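
The refresher above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the code from the original post: the function name `k_means`, the sample points, and the choice to use the first `k` points as starting points are all made up for the example.

```python
from math import dist

def k_means(points, k, n_iter=10):
    """Tiny k-Means sketch; the first k points serve as starting points (an assumption)."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(n_iter):
        # Step 1: assign every point to its closest centroid.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            groups[nearest].append(p)
        # Step 2: move each centroid to the mean of the points assigned to it.
        new_centroids = []
        for i, group in enumerate(groups):
            if group:
                mean = tuple(sum(coord) / len(group) for coord in zip(*group))
            else:
                mean = centroids[i]  # an empty group keeps its old centroid
            new_centroids.append(mean)
        if new_centroids == centroids:
            break  # nothing moved, so the assignment is stable
        centroids = new_centroids
    return centroids, groups

# Made-up 2D points forming two loose blobs.
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centroids, groups = k_means(points, k=2)
print(centroids)
print(groups)
```

Note that `k` has to be supplied up front, which is exactly the downside mentioned above.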


Unsupervised learning method of cluster analysis, k-Means. Part 1 of 2 of clustering analysis

What is Clustering?

Clustering analysis is an unsupervised method or technique for breaking down data into groups/clusters. Like most unsupervised learning, clustering is a great method for exploring a new data set in hopes of finding some type of pattern or underlying structure. This is unsupervised because we aren’t predicting any labels, but rather finding ways to make groups. (Great article on unsupervised vs. supervised learning).

The two types of clustering that I’m familiar with are k-Means and Hierarchical Clustering (HCA), and they work in different ways. I’ll first go over the k-Means method.

K-Means Clustering

This method of clustering sounds very much like our k-Nearest…


Which U.S. state has the lowest avocado prices?

Motivation

In this day and age, who doesn’t like avocado? Probably only those who are allergic, but for everyone else there’s always the problem of cost. Whether you are an individual or a small business, where can you find the cheapest avocados?

Another student and I collaborated on a time series analysis to answer this question.

Data

We first got our data from a Kaggle source posted two years ago by Justin Kiggins. He has this amazing quote really highlighting his motivations.

“It is a well known fact that Millenials LOVE…



Everyone’s looking for someone who knows SQL. It’s easy to understand since SQL reads like plain English, but it all comes down to practice. I, like many others, did a lot of practice on HackerRank’s easy and medium problems, but I never had much experience making my own database or seeing actual SQL statements used. I finally got the chance to use SQLite when I took on some projects that involved a good amount of data. …
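
For a sense of what making your own database can look like, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `fruit` and its rows are hypothetical, invented purely for illustration, and are not taken from the original projects.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical table and rows, made up purely for illustration.
cur.execute("CREATE TABLE fruit (name TEXT, price REAL)")
cur.executemany(
    "INSERT INTO fruit (name, price) VALUES (?, ?)",
    [("avocado", 1.50), ("apple", 0.50), ("mango", 2.00)],
)
conn.commit()

# A plain SELECT with a filter and an ordering.
cur.execute("SELECT name, price FROM fruit WHERE price > ? ORDER BY price", (1.0,))
rows = cur.fetchall()
print(rows)

conn.close()
```

The `?` placeholders let `sqlite3` bind the values safely instead of building the SQL string by hand.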

Richard Mei

Learning Data Science
