Python Fundamentals for Data Science

Category: IT Technology · Published: 4 years ago

The essentials of Python programming you need to get started with Data Science.

This is a 3-part series covering all the fundamentals of Python for Data Science.

Beginners in Data Science who are not familiar with programming often have a hard time figuring out the right starting point. With hundreds of questions on various forums about how to get started with Python programming for DS, this post, along with the video series, is my attempt to settle them.

I am a Python evangelist who started his career as a Full Stack Python Developer before moving on to Data Engineering and then Data Science. My prior experience with Python and a decent grasp of maths made the switch to Data Science comfortable for me. So, here are the fundamentals to help you with programming in Python.

Before diving into the essentials, make sure that you have set up a Python environment and know how to use Jupyter Notebooks (optional).

The basic Python curriculum can be broken down into 4 essential topics that include:

  1. Data Types (Int, Float, Strings)
  2. Compound data structures (Lists, Tuples, and Dictionaries)
  3. Conditionals, Loops, and Functions
  4. Object-Oriented programming and using external libraries

Let’s quickly go over each one of them to see what is important to learn while covering fundamentals and what you’ll learn over time with experience.

1. Data Types and Structures

The very first step is to understand how Python interprets a variety of data. Starting with the widely used data types, you should be familiar with integers (int), floats (float), strings (str), and booleans (bool). What you should practice:

Type, typecasting, and I/O functions:

  • Learning the type of data using the type() function. For example, after a = 5.67, type(a) reports float.
  • Converting a string “55” into the integer 55 with int(). The conversion raises a ValueError when the cast isn’t possible.
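
To make the two points above concrete, here is a minimal sketch of checking types and casting. The helper name to_int_or_none is my own invention for illustration, not something from the original notebook.

```python
a = 5.67
type_of_a = type(a)            # the float class
as_int = int("55")             # the string "55" becomes the integer 55


def to_int_or_none(text):
    """Hypothetical helper: return int(text), or None if the cast fails."""
    try:
        return int(text)
    except ValueError:
        # int() raises ValueError for strings that are not valid integers
        return None


bad = to_int_or_none("fifty-five")
print(type_of_a, as_int, bad)
```

Catching the ValueError is one way to deal with messy input. Letting the error propagate is just as valid when bad data should stop the program.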

Once you are familiar with the basic data types and their usage, you can focus on the arithmetic operators and the order in which expressions are evaluated (DMAS: division, multiplication, addition, subtraction). You can store the result in a variable for further use.
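
A small sketch of how Python evaluates arithmetic expressions may help here. The numbers are arbitrary and chosen only for illustration. Note that in Python, / and * share the same precedence and are evaluated left to right, ahead of + and -.

```python
# Division and multiplication run first (left to right), then + and -
result = 8 + 6 / 3 * 2 - 1        # 6 / 3 = 2.0, 2.0 * 2 = 4.0, 8 + 4.0 - 1 = 11.0

# Parentheses override the default order
grouped = (8 + 6) / (3 * 2) - 1   # 14 / 6 - 1

floor_div = 17 // 5               # integer (floor) division
remainder = 17 % 5                # modulo
power = 2 ** 10                   # exponentiation

print(result, grouped, floor_div, remainder, power)
```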

Strings:

You will often have to deal with textual data, and string operators and methods come in very handy when working with the string data type. Practice these concepts:

  • Concatenating strings using +.
  • Splitting and joining strings using the split() and join() methods.
  • Changing the case of a string using the lower() and upper() methods.
  • Working with substrings of a string.
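
The string operations listed above can be sketched in a few lines. The sample strings are made up for illustration.

```python
first = "data"
second = "science"

phrase = first + " " + second      # concatenation with +
words = phrase.split(" ")          # split on spaces into a list of words
joined = "-".join(words)           # join the words back with a separator
upper = phrase.upper()             # all upper case
lower = "PyThOn".lower()           # all lower case
sub = phrase[0:4]                  # substring via slicing
has_sci = "sci" in phrase          # substring membership test

print(phrase, words, joined, upper, lower, sub, has_sci)
```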

Here’s the Notebook that covers all the points discussed.

2. Compound data structures (Lists, Tuples, and Dictionaries)

Lists and Tuples (compound data types):

One of the most commonly used and important data structures is the Python list, which paves the way for computing algebraic equations and statistical models on arrays of data. A list is a collection of elements, which can be of the same or varied data types. Concepts to be familiar with:

  • Multiple data types can be stored in a Python list.
  • Indexing and slicing to access a specific element or sub-list of the list.
  • Helper methods for sorting, reversing, deleting elements, copying, and appending.
  • Nested lists: lists containing lists. Ex: [1,2,3, [10,11]]
  • Adding and extending lists.
  • Multiplying a list by a scalar (which repeats it) and adding a list to another list (which concatenates them).
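
Here is a short sketch that walks through the list concepts above. The values are arbitrary examples.

```python
mixed = [1, "two", 3.0, [10, 11]]   # mixed data types, including a nested list
first_item = mixed[0]               # indexing starts at 0
nested_item = mixed[3][1]           # index into the nested list
sub_list = mixed[1:3]               # slicing returns a new list

numbers = [3, 1, 2]
numbers.append(5)                   # add one element at the end
numbers.sort()                      # sort in place (ascending)
numbers.reverse()                   # reverse in place
snapshot = numbers.copy()           # shallow copy, unaffected by later changes
del numbers[0]                      # delete by index
numbers.extend([7, 8])              # append every element of another list

repeated = [1, 2] * 2               # multiplying by a scalar repeats the list
combined = [1, 2] + [3]             # adding lists concatenates them

print(numbers, snapshot, repeated, combined)
```

Keep in mind that multiplying a plain list by a number repeats it rather than scaling each element. Element-wise arithmetic is what libraries such as numpy provide.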

Tuples are an immutable, ordered sequence of items. They are similar to lists, but the key difference is that tuples are immutable whereas lists are mutable. The concepts to focus on:

  • Indexing and slicing (similar to lists).
  • Nested tuples.
  • Adding tuples and helper methods like count() and index().
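
A minimal sketch of the tuple concepts above, again with made-up values, including what happens when you try to modify a tuple.

```python
point = (3, 4, 3, (10, 11))     # a tuple holding a nested tuple

first_item = point[0]           # indexing works like lists
middle = point[1:3]             # slicing returns a new tuple
nested_item = point[3][0]       # index into the nested tuple

threes = point.count(3)         # how many times 3 occurs
where_four = point.index(4)     # position of the first 4

longer = point + (5,)           # adding tuples builds a new tuple

mutation_failed = False
try:
    point[0] = 99               # tuples are immutable, so this raises TypeError
except TypeError:
    mutation_failed = True

print(first_item, middle, nested_item, threes, where_four, longer, mutation_failed)
```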

Dictionaries

Dictionaries are another type of collection in Python. Whereas lists are simply indexed by integers, dictionaries work more like addresses: each value is stored under a key, so a dictionary holds key-value pairs. Keys are analogous to indexes in lists.

[Figure: Representation of a dict as key-value pairs]

To access an element, you need to pass the key in square brackets.

[Figure: Accessing the value by passing in the key]

Concepts to focus on:

  • Iterating over a dictionary (covered under loops).
  • Using helper methods like get, pop, items, keys, update, etc.
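
As a rough illustration of key-based access and the helper methods mentioned above, here is a small sketch. The names and ages are invented sample data.

```python
ages = {"alice": 30, "bob": 25}     # key-value pairs

alice_age = ages["alice"]           # access a value by its key
missing = ages.get("carol")         # get() returns None for a missing key
default = ages.get("carol", 0)      # or a default value that you supply

ages.update({"carol": 41})          # add or overwrite entries
removed = ages.pop("bob")           # remove a key and return its value

keys = list(ages.keys())            # all keys, in insertion order
pairs = list(ages.items())          # (key, value) tuples

print(alice_age, missing, default, removed, keys, pairs)
```

Using ages["carol"] on a missing key would raise a KeyError, which is why get() is often the safer choice when a key may be absent.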

The notebook for the above topics can be found here.

3. Conditionals, Loops, and Functions

Conditions and Branching

We discussed the boolean data type (True/False) in the first section. Python uses boolean values to assess conditions: whenever a comparison or evaluation is performed, the result is a boolean value.

Comparisons need to be written carefully, as people often confuse the assignment operator (single equals sign, =) with the comparison operator (double equals sign, ==).
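
The difference between the two operators can be seen in a tiny sketch. The variable names are arbitrary.

```python
x = 5                   # single = assigns the value 5 to x

is_five = (x == 5)      # double == compares and yields a boolean
is_six = (x == 6)
not_six = (x != 6)      # != is the "not equal" comparison
at_least_three = x >= 3

print(is_five, is_six, not_six, at_least_three)
```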

Boolean operators (or, and, not)

These are used to combine several comparisons into a single condition.

OR: At least one of the comparisons must be true for the entire condition to be true.

AND: All of the comparisons must be true for the entire condition to be true.

NOT: Checks for the opposite of the comparison specified.
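
Here is a minimal sketch of the three boolean operators, using invented variables age and has_ticket purely for illustration.

```python
age = 25
has_ticket = False

either = age >= 18 or has_ticket     # True: one true comparison is enough
both = age >= 18 and has_ticket      # False: every comparison must be true
negated = not has_ticket             # True: not flips the boolean

print(either, both, negated)
```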

Concepts to learn:

  • if, elif, and else statements to construct your condition.
  • Making complex comparisons in one condition.
  • Keeping indentation in mind while writing nested if/else statements.
  • Using boolean, “in”, “is” and “not” operators.
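
The points above can be combined in one sketch: a nested if/elif/else that also uses the in and is operators. The function describe and its messages are hypothetical, written only to show the branching.

```python
def describe(value, allowed):
    """Hypothetical example of nested branching with is, in, and, and else."""
    if value is None:                        # "is" checks identity, typical for None
        return "missing"
    elif value in allowed:                   # "in" checks membership
        if value > 0 and value % 2 == 0:     # nested block, indented one level deeper
            return "allowed, positive and even"
        else:
            return "allowed"
    else:
        return "not allowed"


print(describe(None, [1, 2]))
print(describe(2, [1, 2]))
print(describe(1, [1, 2]))
print(describe(9, [1, 2]))
```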

Loops

You will often need to repeat a task, and loops are our best friend for eliminating redundant code. You’ll often need to iterate over each element of a list or dictionary, and loops come in handy for that. “while” and “for” are the 2 types of loops. Focus on:

  • Using range() to generate the sequence of numbers to loop over.
  • Iterating over lists and appending elements (or doing any other task with list items) in a particular order.
  • Using the break, pass, and continue keywords.
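
Here is a short sketch of both loop types and the three keywords. The data, including the prices dictionary, is invented for illustration.

```python
# for loop with range(): collect the squares of 0..4
squares = []
for n in range(5):
    squares.append(n * n)

# continue skips to the next iteration, break leaves the loop entirely
evens = []
for n in range(10):
    if n % 2 == 1:
        continue            # skip odd numbers
    if n > 6:
        break               # stop once we pass 6
    evens.append(n)

# while loop: repeat as long as the condition holds
countdown = []
count = 3
while count > 0:
    countdown.append(count)
    count -= 1

# iterating over a dictionary's key-value pairs
prices = {"apple": 3, "pear": 2}
total = 0
for item, price in prices.items():
    total += price

# pass is a placeholder statement that does nothing
for n in range(3):
    pass

print(squares, evens, countdown, total)
```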

List Comprehension

A sophisticated and succinct way of creating a list from an iterable, using an expression followed by a for clause. For example, a list comprehension can build a list of 9 cubes in a single line.
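
A possible version of the cubes example is sketched below. The original image is not available, so the range 1 to 9 is my assumption.

```python
# Cubes of the numbers 1..9 (assumed range), built in one line
cubes = [n ** 3 for n in range(1, 10)]

# The equivalent loop, for comparison
cubes_loop = []
for n in range(1, 10):
    cubes_loop.append(n ** 3)

# An optional if clause filters the items
even_cubes = [n ** 3 for n in range(1, 10) if n % 2 == 0]

print(cubes, even_cubes)
```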

Functions

When you work on big projects where similar tasks are performed repeatedly, maintaining code becomes a chore, and functions are a convenient way to manage it. A function is a block of code that performs some operations on the input data and gives you the desired output.

Functions make the code more readable, reduce redundancy, make the code reusable, and save time.

Python uses indentation to delimit blocks of code. We define a function using the def keyword, followed by the name of the function, its arguments (input) within parentheses, and a colon. The body of the function is the indented code block, and the result is handed back to the caller with a return statement.

Calling functions: you call a function by specifying its name and passing the arguments within the parentheses, as per its definition.
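
To tie definition and calling together, here is a minimal sketch. The functions mean and scale are examples I made up, not the ones from the original notebook.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    total = sum(values)             # indented lines form the function body
    return total / len(values)      # return hands the result back to the caller


def scale(values, factor=2):
    """Multiply every element by factor (defaults to 2)."""
    return [v * factor for v in values]


average = mean([2, 4, 6])                # call by name, arguments in parentheses
doubled = scale([1, 2, 3])               # uses the default factor
tripled = scale([1, 2, 3], factor=3)     # overrides the default via a keyword argument

print(average, doubled, tripled)
```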

More examples and details here.

4. Object-Oriented programming and using external libraries

We have been using the helper methods for lists, dictionaries, and other data types, but where do these come from? When we say list or dict, we are actually interacting with a list class object or a dict class object. Printing the type of a dictionary object will show you that it is an object of class dict.
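
You can check this yourself with a couple of lines. The sample dictionary is arbitrary.

```python
record = {"name": "Ada"}

dict_type = type(record)            # the dict class
list_type = type([1, 2, 3])         # the list class

print(dict_type)                    # prints <class 'dict'>
print(list_type)                    # prints <class 'list'>
print(isinstance(record, dict))     # isinstance checks an object against a class
```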

These are all predefined classes in the Python language, and they make our tasks easy and convenient.

Objects are instances of a class and are defined as an encapsulation of variables (data) and functions into a single entity. They have access to the variables (attributes) and methods (functions) of their class.

Now, the question is: can we create our own custom classes and objects? The answer is YES.

You define your own class with the class keyword and create an object of it by calling the class like a function.

You can then access the attributes and methods using the dot (.) operator.
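
As a rough sketch of a custom class, here is a hypothetical Circle with one attribute and one method. The original example image is not available, so this is only an illustration of the idea.

```python
import math


class Circle:
    """Hypothetical example class bundling data (radius) with behavior (area)."""

    def __init__(self, radius):
        # __init__ runs when the object is created and sets its attributes
        self.radius = radius

    def area(self):
        # methods receive the object itself as the first argument, self
        return math.pi * self.radius ** 2


circle = Circle(2)              # create an object (instance) of the class
radius = circle.radius          # attribute access with the dot operator
area = circle.area()            # method call with the dot operator

print(radius, area)
```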

Using External Libraries/Modules

One of the reasons for using Python for Data Science is its amazing community, which develops high-quality packages for different domains and problems. Using external libraries and modules is an integral part of working on projects in Python.

These libraries and modules define classes, attributes, and methods that we can use to accomplish our tasks. For example, the math library contains many mathematical functions that we can use to carry out our calculations. Modules are typically .py files. You should learn to:

  • Import libraries in your workspace.
  • Use the help function to learn about a library or function.
  • Import the required function directly.
  • Read the documentation of well-known packages like pandas, numpy and sklearn and use them in your projects.
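
The import styles above can be sketched with the standard-library math module mentioned earlier, which needs no installation. The same import patterns apply to third-party packages once they are installed.

```python
import math                  # import the whole module
from math import sqrt        # import one function directly

root = math.sqrt(16)         # access through the module name
direct = sqrt(25)            # use the directly imported name

# help(math.sqrt) shows the documentation in an interactive session;
# the same text is stored on the function as its docstring
doc = math.sqrt.__doc__

print(root, direct)
print(doc)
```

Importing the whole module keeps it obvious where a function comes from. Importing a function directly is shorter when you use it often.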
