10 Basics
10.1 Importing Libraries
Importing libraries gives you access to Python features beyond the core language. We import them at the top of the file so that all dependencies are visible in one place. An alias (a short name, such as np for numpy) makes it easier to reference a library throughout your code.
Insert the following code in a new code cell directly under your title and execute it:
# Importing libraries (at the top)
import numpy as np
import pandas as pd
import altair as alt
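Once the imports have run, each alias can be used in place of the full library name. The following is a minimal sketch of that idea; the value 16 and the variable name root are illustrative choices, not part of the tutorial's own code.

```python
import numpy as np
import pandas as pd

# The alias np refers to the numpy module, so np.sqrt is numpy's square-root function
root = np.sqrt(16)
print(root)  # 4.0

# The alias pd refers to the pandas module; __version__ reports the installed version
print(pd.__version__)
```

If an import fails with ModuleNotFoundError, the library is most likely not installed in the environment the notebook is using.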
10.2 Variables and Data Types
Variables are used to store different kinds of data in Python. Data types define the nature of this data and influence how it can be used.
Add a new code cell, then copy and paste the code provided below into that cell. Make sure to run the cell afterward.
Example Code:
# Assigning variables (using descriptive names)
name = "Alice" # String for a person's name
age = 30 # Integer for age
height = 5.5 # Float for height in feet
numbers = [1, 2, 3] # List of integers
person = {"name": "Alice", "age": 30} # Dictionary for person data
Explanation:
- name is a string that stores a sequence of characters representing a name.
- age is an integer variable representing a whole number.
- height is a float representing a number with a decimal point.
- numbers is a list, which is an ordered collection of items.
- person is a dictionary consisting of key-value pairs that can store multiple attributes.
To inspect the values stored in these variables, you can use the print function:
print(name) # Output: Alice
print(age) # Output: 30
print(numbers) # Output: [1, 2, 3]
print(person) # Output: {'name': 'Alice', 'age': 30}
Alice
30
[1, 2, 3]
{'name': 'Alice', 'age': 30}
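To see how a data type influences what you can do with a value, you can ask Python for the type of each variable with the built-in type function. The sketch below repeats the variables from above so that it runs on its own; first_number and person_age are illustrative names introduced here.

```python
name = "Alice"
age = 30
height = 5.5
numbers = [1, 2, 3]
person = {"name": "Alice", "age": 30}

# type() returns the class of the object a variable refers to
print(type(name))     # <class 'str'>
print(type(age))      # <class 'int'>
print(type(height))   # <class 'float'>
print(type(numbers))  # <class 'list'>
print(type(person))   # <class 'dict'>

# List items are accessed by position (starting at 0), dictionary values by key
first_number = numbers[0]
person_age = person["age"]
print(first_number)   # 1
print(person_age)     # 30
```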
10.3 Basic Operators and Calculations
Operators allow you to perform calculations, comparisons, and logical operations.
Example Code:
# Arithmetic operations with multiple variables
length = 10
width = 5
# Calculate the area and perimeter
area = length * width # Multiplication
perimeter = 2 * (length + width) # Addition and multiplication
# Display perimeter
perimeter
30
# Comparison operations
is_square = length == width # Check if length and width are equal
is_square
False
# Logical operations
is_large_rectangle = (area > 20) and (perimeter > 10) # Logical AND
is_large_rectangle
True
Explanation:
- length and width are multiplied to calculate the area.
- perimeter is calculated by adding length and width together and then multiplying by 2.
- is_square checks if the two sides are of equal length.
- is_large_rectangle uses the logical operator and to check that the area is greater than 20 and the perimeter is greater than 10.
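The example above uses only a few of Python's operators. As a small sketch, the block below reuses length and width to show some further arithmetic, comparison, and logical operators; all result variable names are illustrative.

```python
length = 10
width = 5

# Further arithmetic operators
quotient = length / width    # True division always returns a float: 2.0
whole = length // 3          # Floor division drops the remainder: 3
remainder = length % 3       # Modulo returns the remainder: 1
squared = width ** 2         # Exponentiation: 25

# Further comparison operators
is_longer = length > width   # True
not_equal = length != width  # True

# Further logical operators
either = (length > 20) or (width < 10)  # True, because the second condition holds
negated = not is_longer                 # False

print(quotient, whole, remainder, squared)
print(is_longer, not_equal, either, negated)
```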
10.4 Working with DataFrames
A DataFrame is a two-dimensional, tabular data structure provided by Pandas. It allows you to efficiently manipulate data, similar to a spreadsheet or SQL table.
Example Code:
# Prepare data
data = {"Numbers": [1, 2, 3, 4, 5],
        "Squares": np.array([1, 2, 3, 4, 5]) ** 2}
# Creating a DataFrame with Pandas
df = pd.DataFrame(data)
# Inspecting the DataFrame
df
| | Numbers | Squares |
|---|---|---|
| 0 | 1 | 1 |
| 1 | 2 | 4 |
| 2 | 3 | 9 |
| 3 | 4 | 16 |
| 4 | 5 | 25 |
Explanation:
- The dictionary data contains two entries: a list of numbers and a NumPy array holding their squares.
- The DataFrame df is created from this dictionary using the pd.DataFrame constructor, which organizes the data into rows and columns. Each dictionary key becomes a column name.
- Typing just df as the last line of a Jupyter Notebook cell displays the DataFrame in a tabular format.
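Beyond displaying the whole table, a DataFrame lets you look at its dimensions, select a single column, and derive new columns. The following is a minimal sketch under the assumption that the same data as above is used; example_df and the Cubes column are hypothetical names chosen so that the tutorial's own df stays unchanged.

```python
import numpy as np
import pandas as pd

example_df = pd.DataFrame({
    "Numbers": [1, 2, 3, 4, 5],
    "Squares": np.array([1, 2, 3, 4, 5]) ** 2,
})

# shape is a (rows, columns) tuple
print(example_df.shape)  # (5, 2)

# Square brackets with a column name select that column as a pandas Series
squares = example_df["Squares"]
print(squares.tolist())  # [1, 4, 9, 16, 25]

# Assigning to a new column name adds a column computed from an existing one
example_df["Cubes"] = example_df["Numbers"] ** 3
print(example_df.columns.tolist())  # ['Numbers', 'Squares', 'Cubes']
```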
10.5 Understanding Methods
A method is a function that is associated with an object (like a DataFrame, list, or string) and is accessed using dot notation (.). This makes it easy to manipulate the object and retrieve information.
Example Code:
df.head() is a method that displays the first few rows of a DataFrame (five by default) to preview the data:
df.head()
| | Numbers | Squares |
|---|---|---|
| 0 | 1 | 1 |
| 1 | 2 | 4 |
| 2 | 3 | 9 |
| 3 | 4 | 16 |
| 4 | 5 | 25 |
df.describe() provides summary statistics like mean, count, and standard deviation. Additionally, we chain the method .round(2) to round the values to two decimal places:
df.describe().round(2)
| | Numbers | Squares |
|---|---|---|
| count | 5.00 | 5.00 |
| mean | 3.00 | 11.00 |
| std | 1.58 | 9.67 |
| min | 1.00 | 1.00 |
| 25% | 2.00 | 4.00 |
| 50% | 3.00 | 9.00 |
| 75% | 4.00 | 16.00 |
| max | 5.00 | 25.00 |
General Method Concept:
- Object: An entity containing data and functionality (e.g., DataFrame, list, or string).
- Method: A function associated with an object.
- Dot Notation (.): Used to call a method on an object.
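The same object, method, and dot notation pattern applies to built-in types such as strings and lists, not only to DataFrames. The sketch below illustrates this with standard Python methods; the variable names and the raw_input_text value are illustrative.

```python
name = "Alice"
numbers = [1, 2, 3]

# String methods return a new string; the original is left unchanged
upper_name = name.upper()
print(upper_name)  # ALICE
print(name)        # Alice

# Some list methods modify the list in place and return None
numbers.append(4)
print(numbers)     # [1, 2, 3, 4]

# Methods can be chained: each call acts on the result of the previous one
raw_input_text = "  alice  "
cleaned = raw_input_text.strip().title()
print(cleaned)     # Alice
```

Chaining works because each method returns an object that has methods of its own, which is the same mechanism behind calling round on the result of describe.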
10.6 Plotting Data
After understanding the structure of a DataFrame, you can visualize its data using Altair.
Example Code:
# Creating the chart
alt.Chart(df).mark_point().encode(
    x='Numbers',
    y='Squares'
)
Explanation:
- Altair Library: A declarative visualization library that simplifies the process of creating charts and graphs.
- alt.Chart(df): Creates a new chart object from the DataFrame df.
- mark_point(): Specifies the type of mark (points, which gives a scatter plot in this case).
- encode: Maps DataFrame columns to chart attributes.
- x='Numbers': Sets the horizontal axis to use the Numbers column.
- y='Squares': Sets the vertical axis to use the Squares column.
This concise syntax allows for rapid creation and customization of visualizations.