Week 1: Data Science Basics

You can’t make a great building on a weak foundation.

Shourya Sharma
Mar 3, 2021

Welcome to my course on Artificial Intelligence. If you’re not entirely sure what this is about, I’m embarking on a challenge in which I create my own curriculum for learning AI. I’ll be using these weekly blogs to document my process, along with anything else I feel is important enough to include.

For more detail on the AI course, you can look at my main article which maps out everything I’m currently doing or have done as part of my training on artificial intelligence.

Where do we start?

I’m working through one course as the foundation for this topic: Intro to Data Science by Udacity. Alongside it, I’ll be reinforcing my knowledge of Python with YouTube tutorials, focusing on extracting and cleaning data so that it is useful to an AI model. Some of the content will overlap and seem repetitive, but I’m of the opinion that attempting similar problems multiple times is a good way to reinforce something new.

Udacity Intro to Data Science

My aim in these Medium posts is to give a general sense of what I have learnt from this module.

Learning Objectives:

  • Gain a basic understanding of Python and the relevant data science libraries
  • Build a theoretical foundation of statistical and machine learning models
  • Gain an understanding of how to visualise different data types

While working through this course, I made notes using notion.so, as it lets me track which lesson I’m currently on and create to-do lists. I’ve provided a link below in case anyone wishes to follow along with the course and finds they need help with the programming.

EDIT:

Due to other time commitments, I was unable to complete my notes. Answers are only available up to lesson 5.

TwitchChess YouTube Tutorial

To practise the data handling skills I’ve learnt from the course, I’ve been following George Hotz’s TwitchChess programming series on YouTube. I’ve embedded a link to part 1 of the series below.

Part 1 of the series is helpful for understanding how Python libraries such as NumPy can be used to convert a dataset (in this case around 6 million chess moves) into numerical arrays that a model can be trained on.
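To give a rough idea of what that conversion looks like, here is a minimal sketch of my own. It is not the encoding used in the tutorial: the `encode_board` function, the `PIECES` ordering and the 12-plane layout are all assumptions made for illustration. It takes the piece-placement part of a FEN string (a standard text notation for chess positions) and turns it into a NumPy array with one 8x8 plane per piece type.

```python
import numpy as np

# Piece letters as they appear in FEN: uppercase = white, lowercase = black.
# The ordering here is an arbitrary choice for this sketch.
PIECES = "PNBRQKpnbrqk"


def encode_board(placement: str) -> np.ndarray:
    """Encode the piece-placement field of a FEN string as a 12x8x8 array.

    Hypothetical helper for illustration, not taken from the tutorial.
    """
    planes = np.zeros((len(PIECES), 8, 8), dtype=np.uint8)
    for row, rank in enumerate(placement.split("/")):
        col = 0
        for ch in rank:
            if ch.isdigit():
                col += int(ch)  # a digit means that many empty squares
            else:
                planes[PIECES.index(ch), row, col] = 1
                col += 1
    return planes


# Piece placement for the standard starting position.
start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"
x = encode_board(start)
print(x.shape)       # (12, 8, 8)
print(int(x.sum()))  # 32, one entry per piece on the board
```

Each position becomes a fixed-size block of zeros and ones, so millions of positions can be stacked into a single array and fed to a model in batches.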

UPDATE:

I’ve found that even when following along line for line, my version of twitchchess wasn’t as accurate, and I struggled to understand where I’d gone wrong. This tells me that I need to work on understanding the mathematics involved, such as why a tanh curve was better suited to this tutorial than a sigmoid curve, and how a CNN actually works. My next article will therefore focus on understanding neural networks, which is why I’ve enrolled in Coursera’s Neural Networks and Deep Learning course.
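As a starting point for the tanh question, the small comparison below shows the one difference I am sure of: the output range of each function. The `sigmoid` helper and the sample inputs are my own. My guess as to why this matters (an assumption, not something I have confirmed from the tutorial) is that a game result labelled -1 for a loss, 0 for a draw and 1 for a win fits tanh’s range directly, whereas sigmoid can never output a negative number.

```python
import numpy as np


def sigmoid(z):
    """Logistic sigmoid: squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


# Sample inputs chosen only for illustration.
z = np.array([-5.0, 0.0, 5.0])

# tanh squashes into (-1, 1) and is centred on 0.
print(np.tanh(z))

# sigmoid squashes into (0, 1) and is centred on 0.5.
print(sigmoid(z))
```

An input of 0 gives 0 from tanh and 0.5 from sigmoid, so only tanh has a natural midpoint for a drawn game under the -1/0/1 labelling assumed above.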
