A Guide to Build Your First Machine Learning Model and Start Your Data Science Career

Category: IT Technology · Published: 4 years ago


Step 1: Complete Kaggle’s course on Machine Learning

Screenshot of Kaggle’s Machine Learning Course

Personally, I found Kaggle’s Intro to Machine Learning course to be the best resource for getting started, because it’s VERY basic: it provides you with the bare minimum needed to build your first machine learning model. You might think that this is a bad thing, but for a beginner there are a few reasons why it’s good:

  • It’s much easier to understand how everything connects when you’re given less information at once. Many small steps trump a few large strides.
  • Confidently building your first machine learning model will give you the motivation to dive deeper into your learning. As with the first point, set many small attainable goals in addition to your big audacious goals.

As you go through this course, keep the following points in mind:

  • Focus on understanding the snippets of code. You’ll be reusing this code later on, so it’s in your best interest if you learn the rationale behind each piece of code. Don’t worry about memorizing the code — there’s nothing wrong with referring back to this as a resource.
  • Don’t stress over the fact that it only covers decision tree and random forest models. You’ll learn later on that you’re only required to change a couple of lines of code to change your machine learning model, so don’t fret.
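To make the last point concrete, here is a minimal sketch of swapping one model for another. It assumes scikit-learn (the library Kaggle’s course uses) and a small synthetic dataset that is made up purely for illustration. The helper name `fit_and_score` is hypothetical and does not come from the course.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic data, invented for illustration only (not from the course).
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 3))
y = 3.0 * X[:, 0] + X[:, 1] ** 2 + rng.normal(0, 1, size=200)

# Hold out part of the data so the models are scored on rows they have not seen.
train_X, val_X, train_y, val_y = train_test_split(X, y, random_state=1)


def fit_and_score(model):
    """Fit the model on the training split and return its validation MAE."""
    model.fit(train_X, train_y)
    return mean_absolute_error(val_y, model.predict(val_X))


# Only the line that constructs the model differs between the two runs.
tree_mae = fit_and_score(DecisionTreeRegressor(random_state=1))
forest_mae = fit_and_score(RandomForestRegressor(random_state=1))

print(f"Decision tree MAE: {tree_mae:.2f}")
print(f"Random forest MAE: {forest_mae:.2f}")
```

Everything around the model (splitting, fitting, predicting, scoring) stays the same, which is why learning the workflow with only two model types is not a limitation.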

Once you finish the course, you can move on to the second step of making your own machine learning model:

Step 2: Find a dataset on Kaggle and recreate your random forest model

Kaggle provides much more than online courses: it has thousands of datasets that you can explore and build models with. Below are the steps required to complete your very own first model.

First, go to Kaggle’s list of datasets here and pick one that interests you. Think about what variable you would like to predict. Do you want to predict life expectancy? Real estate prices? Taxi usage? The world is your oyster.

Then click on ‘New Notebook’. This is where you’ll be replicating the code that you learned from Kaggle’s introductory course.

Once you’re in your new notebook, the rest is easy. Simply replicate the code that you were introduced to in Kaggle’s Intro to Machine Learning course. Then there are only a few things that you need to change:

  • Change the CSV file that .read_csv() reads to the dataset that you chose.
  • Change the prediction variable, y, to the variable that you want to predict in the dataset that you chose.
  • Change the features (the x variables) that you’ll use to predict y.
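The three changes above can be sketched as follows. This is a minimal illustration, not the course’s exact code: it assumes pandas and scikit-learn, and the column names (`rooms`, `area_sqm`, `price`) and the tiny in-memory CSV are invented so that the example runs on its own. In a real notebook you would pass the path of your chosen dataset to `read_csv` instead.

```python
import io

import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Stand-in for your dataset: a tiny made-up CSV held in memory.
csv_text = """rooms,area_sqm,price
2,50,150000
3,70,210000
3,80,230000
4,100,300000
2,45,140000
5,130,390000
4,110,320000
3,75,220000
"""

# Change 1: point read_csv at the dataset you chose
# (here an in-memory buffer; normally a file path).
data = pd.read_csv(io.StringIO(csv_text))

# Change 2: pick the column you want to predict.
y = data["price"]

# Change 3: pick the feature columns used to predict it.
features = ["rooms", "area_sqm"]
X = data[features]

# The rest follows the usual split / fit / predict / score workflow.
train_X, val_X, train_y, val_y = train_test_split(X, y, random_state=1)
model = RandomForestRegressor(random_state=1)
model.fit(train_X, train_y)
val_mae = mean_absolute_error(val_y, model.predict(val_X))

print(f"Validation MAE: {val_mae:.0f}")
```

With only eight rows the error figure is meaningless; the point is that the three marked lines are the only ones tied to a particular dataset.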

And that’s it! You’ve created your very own machine learning model with a dataset that you chose yourself. It may not seem like much right now, but a few more months into your data science journey you’ll look back at this and see how much progress you’ve made.

My First Machine Learning Model

For my first algorithm, I wanted to create something that I thought would be relevant later in my life. I decided to use a “Used Car Dataset” from Kaggle, which has over 600,000 used car listings. The algorithm I created aimed to predict the price of a used car based on a number of features, including the year it was built, the manufacturer, the odometer reading (number of kilometers), and more.

You can see how I coded my first model here. It’s not much, but seven weeks later I was able to improve it significantly with many more steps (see my improved model here).

Next Steps

There are a million things that you can learn to improve your model. Here are some articles that you can start with if you don’t know what to do next:

Thanks for the read!

If you like my work and want to support me, sign up on my email list here to be the first to hear about new and exclusive content!

