📌 Overview — What is Python?
One of the world's most popular beginner-friendly programming languages
Created: 1991 by Guido van Rossum
Type: Interpreted, high-level
Typing: Dynamically typed
Use: Web, AI/ML, Data Science, Automation
Teaching hook: "Python reads almost like English. If you can describe what you want, you can probably write it in Python."
Python is a general-purpose programming language that prioritizes readability and simplicity. It uses indentation instead of curly braces, making code look clean and consistent.
What Do These Terms Mean?
🔄 Interpreted
Python executes your code statement by statement as the program runs. Behind the scenes the standard interpreter (CPython) first translates your source to bytecode, but you never need a separate "compile" step: just write and run.
Like a live translator at a conference: each sentence is translated as you speak.
Compiled languages such as C translate the entire program to machine code before it runs (Java also compiles ahead of time, to bytecode), like translating a whole book before publishing.
🏔️ High-Level
Python is close to human language, not machine code. You don't manage memory, registers, or hardware directly — Python handles that for you.
Like ordering food at a restaurant — you say "I want biryani", you don't go into the kitchen.
Low-level languages (C, Assembly) give you direct hardware control but require more effort.
🔀 Dynamically Typed
You don't declare the type of a variable — Python figures it out automatically at runtime. A variable can even change type!
x = 10 # x is an int
x = "hello" # now x is a str ✅
In statically typed languages (Java, C#), you must declare: int x = 10; and x = "hello" would be an error.
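As a small illustration of the point above, the sketch below rebinds one name to values of different types and inspects the type at runtime. The variable names are only for illustration.

```python
# Dynamic typing: the type belongs to the value, not to the variable name.
x = 10
first_type = type(x).__name__    # 'int'

x = "hello"                      # rebinding the same name to a str is allowed
second_type = type(x).__name__   # 'str'

print(first_type, second_type)   # int str

# Type errors still exist, but they surface only when the line actually runs.
try:
    result = x + 5               # str + int is not defined
except TypeError as err:
    result = None
    print(f"TypeError: {err}")
```

The flexibility has a cost: a mismatch like `x + 5` is only reported when that line executes, not before the program starts.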
🆚 Quick Comparison
| | Python | Java / C |
|---|---|---|
| Execution | Interpreted | Compiled |
| Level | High-level | Mid / Low |
| Typing | Dynamic | Static |
| Speed | Slower | Faster |
| Ease | Easier | More verbose |
🐍 Python
if age >= 18:
    print("You can vote!")
else:
    print("Too young to vote")
☕ Java (same logic)
if (age >= 18) {
    System.out.println("You can vote!");
} else {
    System.out.println("Too young to vote");
}
Notice: Python has no semicolons, no curly braces, no boilerplate — just clean logic.
🤔 Why Python?
📖 Easy to Learn
Simple syntax that reads like English. No complex boilerplate. Great first language.
🌍 Huge Community
Millions of developers, thousands of free tutorials, Stack Overflow answers for everything.
📦 Rich Libraries
NumPy, Pandas, Matplotlib, Scikit-learn, TensorFlow, Flask, Django — ready-made tools for any task.
🤖 AI & Data Science
The most widely used language for Machine Learning, Deep Learning, and Data Analysis.
💼 Career Demand
Consistently ranked among the most in-demand languages globally. Used at Google, Microsoft, Netflix, Instagram, NASA.
🔄 Versatile
Web apps, APIs, automation scripts, desktop apps, games, IoT — Python does it all.
💡 Fun fact: Python is named after Monty Python's Flying Circus (a British comedy show), not the snake!
Where is Python used?
| Domain | What Python Does | Popular Libraries |
|---|---|---|
| Data Science | Analyze data, build dashboards, find patterns | Pandas, NumPy, Matplotlib |
| Machine Learning | Train models, make predictions | Scikit-learn, XGBoost |
| Deep Learning | Image recognition, NLP, speech | TensorFlow, PyTorch |
| Web Development | Build websites & APIs | Django, Flask, FastAPI |
| Automation | Automate repetitive tasks, scraping | Selenium, BeautifulSoup |
| DevOps | Infrastructure scripts, CI/CD | Ansible, Fabric |
⚙️ Setup & First Program
Step 1: Install Python
🪟 Windows
- Go to python.org/downloads
- Download Python 3.12+ installer
- ✅ Check "Add Python to PATH"
- Click Install Now
🍎 macOS / 🐧 Linux
Often pre-installed (most Linux distributions include Python 3; on macOS you may need to install it). Verify:
python3 --version
# Python 3.12.x
Or install via brew install python3 (macOS)
Step 2 — Choose an Editor
⭐ VS Code
Free, lightweight, Python extension with IntelliSense & debugging. Recommended.
🧪 Jupyter Notebook
Interactive, cell-by-cell execution. Great for data exploration.
🌐 Google Colab
Browser-based, no setup needed. Free GPU for ML. Perfect for beginners.
Step 3 — Your First Program 🎉
# hello.py — your very first Python program
print("Hello, World!")
print("Welcome to Python 🐍")
▶️ Run it: Open terminal → python hello.py (or python3 hello.py on macOS/Linux) → see the output!
Hello, World!
Welcome to Python 🐍
🧪 What is Jupyter Notebook?
Jupyter Notebook is an interactive coding environment where you write and run Python in small chunks called cells — instead of writing an entire .py file. You see the output immediately below each cell, making it perfect for learning, experimentation, and data analysis.
📄 Regular .py File
Write all code → Run everything at once → See all output at the end
Like writing a full essay and submitting it
🧪 Jupyter Notebook (.ipynb)
Write code in cells → Run one cell at a time → See output instantly below
Like having a conversation — ask a question, get an answer
How to Get Jupyter
Option 1: Install Locally
pip install notebook
jupyter notebook
Opens in your browser at localhost:8888
Option 2: VS Code
Install the Jupyter extension in VS Code. Create a .ipynb file and run cells right inside the editor.
⭐ Best of both worlds
Option 3: Google Colab
Go to colab.research.google.com — it's a free, cloud-hosted Jupyter notebook. No install needed!
🌐 Zero setup
How It Works — Cell by Cell
In Jupyter, you write code in cells and press Shift + Enter to run each one. The output appears right below:
💡 Key concept: Variables persist across cells — once you define name in Cell 1, you can use it in Cell 2, 3, 4, etc. Think of the entire notebook as one shared workspace.
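A rough sketch of that workflow, written here as one plain script so it runs anywhere. The `# Cell N` comments mark where separate notebook cells would start, and the variable names are illustrative.

```python
# Cell 1: define a variable (press Shift + Enter to run the cell)
name = "Alice"

# Cell 2: the variable from Cell 1 is still available
greeting = f"Hello, {name}!"
print(greeting)          # Hello, Alice!

# Cell 3: build on the earlier cells again
shout = greeting.upper()
print(shout)             # HELLO, ALICE!
```

In a real notebook, changing `name` in Cell 1 does nothing to `greeting` until you re-run both Cell 1 and Cell 2, because each cell only updates the shared workspace when it is executed.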
Jupyter Cell Types
💻 Code Cell
Write and run Python code. Output appears below. This is where your logic lives.
📝 Markdown Cell
Write notes, headings, explanations using Markdown. Great for documenting your thought process alongside code.
🎓 For this course: We'll be using Jupyter Notebooks for all our coding exercises. It lets you experiment step by step, see results immediately, and keep notes alongside your code.
📦 Variables & Data Types
A variable is a named container that stores a value. In Python, you don't need to declare the type; Python figures it out automatically (dynamic typing).
# Creating variables — no type declaration needed!
name = "Alice" # str (text)
age = 25 # int (whole number)
height = 5.6 # float (decimal number)
is_student = True # bool (True/False)
print(name, age, height, is_student)
# Alice 25 5.6 True
Core Data Types
| Type | What It Stores | Example | Check with |
|---|---|---|---|
| int | Whole numbers | 42, -7, 0 | type(42) |
| float | Decimal numbers | 3.14, -0.5 | type(3.14) |
| str | Text (strings) | "hello", 'world' | type("hi") |
| bool | True / False | True, False | type(True) |
| NoneType | No value / empty | None | type(None) |
Type Checking & Conversion
# Check the type of a variable
x = 42
print(type(x)) # <class 'int'>
# Convert between types (type casting)
num_str = "100"
num_int = int(num_str) # str → int
num_float = float(num_str) # str → float
back_to_str = str(42) # int → str
print(num_int, num_float, back_to_str)
# 100 100.0 42
⚠️ Common mistake: int("hello") raises a ValueError! int() only accepts strings that look like whole numbers; even int("3.14") fails, so convert with float() first.
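One way to guard against that mistake is a small helper that falls back to a default instead of crashing. This is only a sketch: `to_int` and its `default` parameter are hypothetical names, not built-ins.

```python
def to_int(text, default=None):
    """Return int(text), or `default` if text does not look like a whole number."""
    try:
        return int(text)
    except ValueError:
        return default

print(to_int("100"))         # 100
print(to_int("hello"))       # None
print(to_int("3.14", 0))     # 0
print(int(float("3.14")))    # 3  (convert via float first, then truncate)
```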
Variable Naming Rules
✅ Valid Names
my_name = "Alice"
age2 = 25
_private = "secret"
MAX_SIZE = 100
❌ Invalid Names
# 2name = "Bob" ← starts with number
# my-name = "Eve" ← has hyphen
# class = "Math" ← reserved keyword
# my name = "Pat" ← has space
Convention: Use snake_case for variables and functions in Python (e.g., student_name, not studentName).
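If you are unsure whether a name is allowed, Python can tell you. The sketch below combines `str.isidentifier()` with the standard `keyword` module; the helper name `is_valid_name` is made up for this example.

```python
import keyword

def is_valid_name(name):
    """True if `name` could be used as a variable name."""
    return name.isidentifier() and not keyword.iskeyword(name)

candidates = ["my_name", "age2", "_private", "2name", "my-name", "class", "my name"]
for candidate in candidates:
    print(f"{candidate!r}: {is_valid_name(candidate)}")
# The first three print True, the last four print False
```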
🔤 Strings — Working with Text
Strings (str) are sequences of characters. Python makes it easy to create, combine, and manipulate them.
# Creating strings
single = 'Hello'
double = "World"
multi = """This is a
multi-line string"""
# Concatenation (joining)
greeting = single + " " + double
print(greeting) # Hello World
# Repetition
print("Ha" * 3) # HaHaHa
# Length
print(len(greeting)) # 11
f-Strings (Formatted Strings) ⭐
The modern, recommended way to embed variables inside strings:
name = "Alice"
age = 25
city = "Hyderabad"
# f-string — put f before the quote, use {variable}
print(f"My name is {name}, I'm {age} years old")
# My name is Alice, I'm 25 years old
# You can even put expressions inside {}
print(f"Next year I'll be {age + 1}")
# Next year I'll be 26
Useful String Methods
text = " Hello, Python World! "
print(text.strip()) # "Hello, Python World!" (removes leading/trailing whitespace)
print(text.lower()) # " hello, python world! "
print(text.upper()) # " HELLO, PYTHON WORLD! "
print(text.replace("Python", "Java")) # " Hello, Java World! "
print(text.split(",")) # [' Hello', ' Python World! ']
print("Python" in text) # True
String Indexing & Slicing
Every character in a string has a position number (index). Slicing lets you extract a portion of the string using the syntax [start:stop:step].
📍 Indexing — Accessing One Character
word = "PYTHON"
#         P   Y   T   H   O   N
# Index:  0   1   2   3   4   5
# Neg:   -6  -5  -4  -3  -2  -1
print(word[0]) # P (first character)
print(word[3]) # H (4th character)
print(word[-1]) # N (last character)
print(word[-2]) # O (second from end)
💡 Positive index counts from the left (starts at 0). Negative index counts from the right (starts at -1).
✂️ Slicing — [start : stop : step]
Slicing extracts a substring. The syntax has 3 parts:
| Part | Meaning | Default |
|---|---|---|
| start | Where to begin (inclusive) | 0 (beginning) |
| stop | Where to end (exclusive: NOT included!) | end of string |
| step | How many characters to skip | 1 (every character) |
word = "PYTHON"
# [start:stop] — from start up to (but NOT including) stop
print(word[0:3]) # PYT (index 0, 1, 2 — stop at 3)
print(word[2:5]) # THO (index 2, 3, 4)
# Omit start → starts from beginning
print(word[:4]) # PYTH (first 4 characters)
# Omit stop → goes to the end
print(word[2:]) # THON (from index 2 to end)
# Both omitted → entire string
print(word[:]) # PYTHON (full copy)
🦘 Using Step
word = "PYTHON"
# [::step] — skip characters
print(word[::2]) # PTO (every 2nd character: P_T_O_)
print(word[::3]) # PH (every 3rd character: P__H__)
# Negative step → go BACKWARDS
print(word[::-1]) # NOHTYP (reversed!)
print(word[::-2]) # NHY (reversed, every 2nd)
# Combine start:stop:step
print(word[1:5:2]) # YH (index 1 to 4, every 2nd)
📝 More Slicing Examples
msg = "Hello, World!"
print(msg[0:5]) # Hello
print(msg[7:]) # World!
print(msg[-6:]) # World! (last 6 characters)
print(msg[-6:-1]) # World (last 6, minus the final "!")
# Check if palindrome
word = "madam"
print(word == word[::-1]) # True ✅ (it's a palindrome!)
💡 Memory trick: Think of slicing as [start : stop : step] = "From where? To where? How fast?"
The stop index is never included — like a hotel checkout time: "checkout by 11" means you leave before 11.
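Excluding the stop index makes the arithmetic easy. The short sketch below checks two handy consequences on the same "PYTHON" string; the variable names are illustrative.

```python
word = "PYTHON"

# Because stop is excluded, the slice length is simply stop - start.
start, stop = 2, 5
piece = word[start:stop]
print(piece, len(piece) == stop - start)    # THO True

# Splitting at any index k and re-joining gives back the original string.
k = 4
head, tail = word[:k], word[k:]
print(head, tail, head + tail == word)      # PYTH ON True

# Out-of-range slice bounds are clipped instead of raising an error.
print(word[2:100])                          # THON
```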
💡 Remember: Strings are immutable in Python — you can't change a character in place. You create a new string instead.
🔒 What does "immutable" mean?
Immutable = once created, it cannot be changed. If you try to modify a character inside a string, Python will throw an error:
name = "Alice"
# ❌ This will CRASH — strings can't be changed in place
name[0] = "B"
# TypeError: 'str' object does not support item assignment
So how do you "change" a string? You create a brand new string:
name = "Alice"
# ✅ Create a NEW string using concatenation
new_name = "B" + name[1:]
print(new_name) # Blice
# ✅ Or use replace() — also creates a new string
new_name = name.replace("A", "B")
print(new_name) # Blice
# The original is UNCHANGED
print(name) # Alice (still the same!)
💡 Analogy: Think of a string like a printed page — you can't erase a letter on a printed page. But you can photocopy it with changes and throw away the old one. That's what Python does — it creates a new string every time.
Mutable vs Immutable — Quick Reference
🔒 Immutable (can't change)
- str: "hello"
- int: 42
- float: 3.14
- bool: True
- tuple: (1, 2, 3)
- frozenset: frozenset([1,2])
🔓 Mutable (can change)
- list: [1, 2, 3]
- dict: {"a": 1}
- set: {1, 2, 3}
These can be modified in place — add, remove, or change items without creating a new object.
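The practical consequence shows up when two names point at the same object. A minimal sketch, with made-up variable names:

```python
# Mutable: two names can refer to the SAME list object.
a = [1, 2, 3]
b = a                 # no copy is made
b.append(4)
print(a)              # [1, 2, 3, 4]  (a changed too)
print(a is b)         # True

# Immutable: "changing" a string builds a new object and rebinds one name.
s = "hello"
t = s
t = t + "!"           # creates a new str; s is untouched
print(s, t)           # hello hello!
print(s is t)         # False

# To get an independent list, copy it explicitly.
c = a.copy()
c.append(5)
print(a, c)           # [1, 2, 3, 4] [1, 2, 3, 4, 5]
```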
🗂️ Data Structures — Organizing Your Data
So far we've seen simple values like int, str, bool. But real programs need to store collections of data. Python gives you four built-in data structures, each designed for different situations.
Teaching hook: "Think of your phone — your contacts are a list, your settings are a dictionary, your unique app permissions are a set, and your screen resolution is a tuple."
Key Terms
📌 Ordered
Items stay in the sequence you added them. You can access them by position (index 0, 1, 2…). If it's unordered, there's no guaranteed position.
✏️ Mutable
You can change, add, or remove items after creation. Immutable means once created, the contents are locked — you must create a new one instead.
🔢 Indexed
You can access an item using [position] or [key]. Non-indexed structures require looping to find an item.
🚫 Duplicates
Whether the same value can appear more than once. Sets and dictionary keys enforce uniqueness automatically.
The Big Picture
| | List [] | Tuple () | Set {} | Dictionary {k:v} |
|---|---|---|---|---|
| Ordered? | ✅ Yes | ✅ Yes | ❌ No | ✅ Yes (3.7+) |
| Mutable? | ✅ Yes | ❌ No | ✅ Yes | ✅ Yes |
| Duplicates? | ✅ Allowed | ✅ Allowed | ❌ No | ❌ Keys unique |
| Indexed? | ✅ By position | ✅ By position | ❌ No | ✅ By key |
| Syntax | [1, 2, 3] | (1, 2, 3) | {1, 2, 3} | {"a": 1} |
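The rows of the table can be checked directly. The sketch below feeds the same values, including a duplicate, into each structure; all names are illustrative.

```python
items_list = [1, 2, 2, 3]
items_tuple = (1, 2, 2, 3)
items_set = {1, 2, 2, 3}
items_dict = {"a": 1, "b": 2, "a": 99}    # repeated key: the last value wins

print(len(items_list), len(items_tuple))  # 4 4  (duplicates kept)
print(len(items_set))                     # 3    (duplicate 2 dropped)
print(items_dict)                         # {'a': 99, 'b': 2}
print(items_list[0], items_tuple[0], items_dict["a"])   # 1 1 99

# Sets have no positions, and tuples cannot be modified.
errors = []
try:
    items_set[0]
except TypeError:
    errors.append("set is not indexable")
try:
    items_tuple[0] = 9
except TypeError:
    errors.append("tuple is immutable")
print(errors)   # ['set is not indexable', 'tuple is immutable']
```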
📋 1. Lists — Ordered & Mutable
A list is like a shopping cart — you can add, remove, reorder, and change items anytime.
Creating & Accessing Lists
# Creating lists
fruits = ["apple", "banana", "cherry", "mango"]
numbers = [10, 20, 30, 40, 50]
mixed = ["Alice", 25, True, 3.14] # different types OK!
empty = [] # empty list
# Accessing by index (0-based)
print(fruits[0]) # apple (first)
print(fruits[-1]) # mango (last)
print(fruits[1:3]) # ['banana', 'cherry'] (slice)
# Check if item exists
print("banana" in fruits) # True
print("grape" in fruits) # False
Modifying Lists (Add, Remove, Change)
fruits = ["apple", "banana", "cherry"]
# ➕ Adding items
fruits.append("mango") # Add to end
fruits.insert(1, "kiwi") # Add at index 1
fruits.extend(["grape", "fig"]) # Add multiple items
print(fruits)
# ['apple', 'kiwi', 'banana', 'cherry', 'mango', 'grape', 'fig']
# ➖ Removing items
fruits.remove("banana") # Remove by value (first match)
popped = fruits.pop() # Remove & return last item
fruits.pop(0) # Remove by index
print(f"Popped: {popped}") # Popped: fig
# ✏️ Changing items
fruits[0] = "blueberry" # Replace at index
# 📏 Length
print(len(fruits)) # number of items
Sorting, Reversing & Useful Operations
numbers = [5, 2, 8, 1, 9, 3]
# Sorting
print(sorted(numbers)) # [1, 2, 3, 5, 8, 9] (new list)
numbers.sort() # sorts in-place (modifies original)
numbers.sort(reverse=True) # descending: [9, 8, 5, 3, 2, 1]
# Reversing
numbers.reverse() # reverses in-place
# Math on lists
print(sum(numbers)) # 28
print(min(numbers)) # 1
print(max(numbers)) # 9
print(numbers.count(5)) # 1 (how many times 5 appears)
print(numbers.index(8)) # position of first 8
# Joining lists
a = [1, 2]
b = [3, 4]
print(a + b) # [1, 2, 3, 4]
print(a * 3) # [1, 2, 1, 2, 1, 2]
⭐ List Comprehensions — One-Liner Magic
Create new lists by transforming or filtering existing ones — in a single line.
# Basic: squares of 1 to 5
squares = [x**2 for x in range(1, 6)]
print(squares) # [1, 4, 9, 16, 25]
# With filter: only even numbers
nums = [1, 2, 3, 4, 5, 6, 7, 8]
evens = [x for x in nums if x % 2 == 0]
print(evens) # [2, 4, 6, 8]
# Transform: uppercase each fruit
fruits = ["apple", "banana", "cherry"]
upper = [f.upper() for f in fruits]
print(upper) # ['APPLE', 'BANANA', 'CHERRY']
# Nested: flatten a 2D list
matrix = [[1, 2], [3, 4], [5, 6]]
flat = [n for row in matrix for n in row]
print(flat) # [1, 2, 3, 4, 5, 6]
💡 Real-world analogy: A list is like a playlist — ordered, you can add/remove songs, rearrange them, and have duplicates.
📌 2. Tuples — Ordered & Immutable
A tuple is like a sealed envelope — once created, you cannot change its contents. Use it for data that should stay fixed.
Creating & Accessing Tuples
# Creating tuples
coordinates = (10, 20)
colors = ("red", "green", "blue")
singleton = (42,) # ⚠️ single-element tuple needs a comma!
from_list = tuple([1, 2, 3]) # convert list → tuple
# Accessing (same as list)
print(colors[0]) # red
print(colors[-1]) # blue
print(colors[0:2]) # ('red', 'green')
print(len(colors)) # 3
print("red" in colors) # True
Tuple Unpacking & Multiple Assignment
# Unpacking — assign each element to a variable
point = (10, 20, 30)
x, y, z = point
print(f"x={x}, y={y}, z={z}") # x=10, y=20, z=30
# Swap two variables (Python magic!)
a = 5
b = 10
a, b = b, a
print(a, b) # 10 5
# Return multiple values from a function
def get_min_max(numbers):
    return min(numbers), max(numbers)
lo, hi = get_min_max([4, 1, 9, 2])
print(f"Min: {lo}, Max: {hi}") # Min: 1, Max: 9
# Star unpacking — grab the rest
first, *middle, last = (1, 2, 3, 4, 5)
print(first, middle, last) # 1 [2, 3, 4] 5
Why Tuples? (vs Lists)
✅ Use Tuples When
- Data should not change (coordinates, RGB)
- Using as dictionary keys (lists can't be keys!)
- Returning multiple values from functions
- You want slightly faster performance
✅ Use Lists When
- Data changes (add/remove/update items)
- You need sorting, filtering
- Building a collection over time
- Order + mutability both matter
⚠️ Gotcha: (42) is just the number 42 in parentheses. For a single-element tuple, you need a trailing comma: (42,)
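Both the comma gotcha and the "tuples can be dictionary keys" point are quick to verify. In this sketch the coordinates are rough, made-up values used only as keys.

```python
not_a_tuple = (42)
one_item = (42,)
print(type(not_a_tuple).__name__)   # int
print(type(one_item).__name__)      # tuple

# Tuples are hashable, so they can be dictionary keys.
city_at = {(17.38, 78.48): "Hyderabad", (19.07, 72.87): "Mumbai"}
print(city_at[(17.38, 78.48)])      # Hyderabad

# Lists are not hashable, so using one as a key fails.
try:
    bad = {[17.38, 78.48]: "Hyderabad"}
except TypeError as err:
    bad = None
    print(f"TypeError: {err}")
```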
🎯 3. Sets — Unordered & Unique
A set is like a bag of unique items — no duplicates allowed, and there's no fixed order.
Creating Sets & Removing Duplicates
# Creating sets
unique_nums = {1, 2, 3, 4, 5}
vowels = {"a", "e", "i", "o", "u"}
empty_set = set() # ⚠️ NOT {} — that creates an empty dict!
# 🔥 Killer feature: auto-remove duplicates
names = ["Alice", "Bob", "Alice", "Charlie", "Bob"]
unique_names = set(names)
print(unique_names) # {'Alice', 'Bob', 'Charlie'} (order may vary)
# Convert back to list if needed
unique_list = list(unique_names)
print(unique_list) # ['Alice', 'Bob', 'Charlie'] (order may vary)
Adding, Removing & Checking Membership
skills = {"Python", "SQL", "Excel"}
# ➕ Adding
skills.add("Tableau") # add one item
skills.update(["R", "Java"]) # add multiple items
# ➖ Removing
skills.discard("Excel") # remove (no error if missing)
skills.remove("SQL") # remove (KeyError if missing!)
# 🔍 Membership test — SUPER FAST (O(1))
print("Python" in skills) # True
print("C++" in skills) # False
⭐ Set Operations — Union, Intersection, Difference
Sets support mathematical operations — perfect for comparing groups!
frontend = {"HTML", "CSS", "JavaScript", "React"}
backend = {"Python", "JavaScript", "SQL", "Django"}
# Union — all skills from both (no duplicates)
all_skills = frontend | backend
print(all_skills)
# {'HTML', 'CSS', 'JavaScript', 'React', 'Python', 'SQL', 'Django'} (order may vary)
# Intersection — skills in BOTH
common = frontend & backend
print(common) # {'JavaScript'}
# Difference — in frontend but NOT in backend
only_front = frontend - backend
print(only_front) # {'HTML', 'CSS', 'React'}
# Symmetric difference — in one but NOT both
exclusive = frontend ^ backend
print(exclusive)
# {'HTML', 'CSS', 'React', 'Python', 'SQL', 'Django'} (order may vary)
💡 Real example: "Find students enrolled in Math but not Science" → math_students - science_students
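Spelled out, that example might look like the sketch below. The student names are invented, and `sorted()` is used only to get a predictable print order.

```python
math_students = {"Alice", "Bob", "Charlie", "Dina"}
science_students = {"Bob", "Dina", "Evan"}

math_only = math_students - science_students   # in Math but not Science
both = math_students & science_students        # in both classes
either = math_students | science_students      # in at least one class

print(sorted(math_only))   # ['Alice', 'Charlie']
print(sorted(both))        # ['Bob', 'Dina']
print(sorted(either))      # ['Alice', 'Bob', 'Charlie', 'Dina', 'Evan']
```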
Frozen Sets — Immutable Sets
# A frozenset is an immutable set — cannot add/remove items
constants = frozenset([3.14, 2.718, 1.618])
print(3.14 in constants) # True
# Can be used as dictionary keys or inside other sets
pair = frozenset([1, 2])
my_dict = {pair: "a pair"} # ✅ works!
💡 Real-world analogy: A set is like a bag of unique Lego colors — you can't have two reds, and you just reach in without caring about order.
📖 4. Dictionaries — Key-Value Pairs
A dictionary is like a phone book — you look up a name (key) to find a number (value). Every key must be unique.
Creating & Accessing Dictionaries
# Creating a dictionary
student = {
    "name": "Alice",
    "age": 25,
    "city": "Hyderabad",
    "courses": ["Python", "Data Science"]
}
# Accessing values
print(student["name"]) # Alice
print(student.get("age")) # 25
print(student.get("phone", "N/A")) # N/A (default if missing)
print(student["courses"][0]) # Python (nested access)
Adding, Updating & Removing
student = {"name": "Alice", "age": 25}
# ➕ Adding / Updating
student["email"] = "alice@email.com" # add new key
student["age"] = 26 # update existing
student.update({"city": "Mumbai", "gpa": 3.8}) # add multiple
# ➖ Removing
del student["email"] # delete key
removed = student.pop("gpa") # remove & return value
print(f"Removed GPA: {removed}") # 3.8
# 📏 Info
print(len(student)) # number of keys
print("name" in student) # True
Looping Through Dictionaries
scores = {"Math": 90, "Science": 85, "English": 78}
# Loop through keys
for subject in scores:
    print(subject) # Math, Science, English
# Loop through values
for marks in scores.values():
    print(marks) # 90, 85, 78
# Loop through key-value pairs ⭐
for subject, marks in scores.items():
    print(f"{subject}: {marks}")
# Math: 90
# Science: 85
# English: 78
Dictionary Comprehension
# Create a dict of squares
squares = {x: x**2 for x in range(1, 6)}
print(squares) # {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}
# Filter: only passing scores
scores = {"Math": 90, "Art": 45, "Science": 85, "PE": 30}
passed = {k: v for k, v in scores.items() if v >= 50}
print(passed) # {'Math': 90, 'Science': 85}
# Swap keys and values
swapped = {v: k for k, v in scores.items()}
print(swapped) # {90: 'Math', 45: 'Art', 85: 'Science', 30: 'PE'}
Nested Dictionaries
# A dictionary of dictionaries
classroom = {
    "Alice": {"age": 25, "grade": "A"},
    "Bob": {"age": 22, "grade": "B"},
    "Charlie": {"age": 23, "grade": "A"}
}
# Access nested value
print(classroom["Alice"]["grade"]) # A
# Loop through nested dict
for name, info in classroom.items():
    print(f"{name}: age {info['age']}, grade {info['grade']}")
💡 Real-world analogy: A dictionary is like a student ID card — each field (name, roll number, department) is a key, and the actual data is the value.
🔤 Bonus: Strings Are Sequences Too!
Strings behave like immutable lists of characters — you can index, slice, and loop through them.
word = "PYTHON"
print(word[0]) # P
print(word[-1]) # N
print(word[1:4]) # YTH
print(len(word)) # 6
for ch in word:
    print(ch, end=" ")
# P Y T H O N
🔢 Bonus: range() — Generate Number Sequences
range() produces a sequence of numbers on the fly — it doesn't store them all in memory, making it very efficient. You'll use it constantly with for loops.
# range(stop) — 0 to stop-1
print(list(range(5)))
# [0, 1, 2, 3, 4]
# range(start, stop) — start to stop-1
print(list(range(2, 8)))
# [2, 3, 4, 5, 6, 7]
# range(start, stop, step) — with custom step
print(list(range(0, 20, 3)))
# [0, 3, 6, 9, 12, 15, 18]
# Counting backwards
print(list(range(10, 0, -1)))
# [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
# Even numbers from 2 to 10
print(list(range(2, 11, 2)))
# [2, 4, 6, 8, 10]
Common Uses with Loops
# Repeat something 5 times
for i in range(5):
    print(f"Attempt {i + 1}")
# Loop with index over a list
fruits = ["apple", "banana", "cherry"]
for i in range(len(fruits)):
    print(f"{i}: {fruits[i]}")
# 0: apple
# 1: banana
# 2: cherry
# Quick sum: 1 + 2 + ... + 100
total = sum(range(1, 101))
print(total) # 5050
💡 Key point: range() is lazy — it generates numbers one at a time instead of creating a whole list. That's why range(1000000) uses almost no memory, but list(range(1000000)) would create a million-item list.
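You can see the difference with `sys.getsizeof`, which reports the size of the container object itself. The exact byte counts depend on your Python build, so the comments only give orders of magnitude.

```python
import sys

lazy = range(1_000_000)
eager = list(lazy)

print(sys.getsizeof(lazy))    # a few dozen bytes
print(sys.getsizeof(eager))   # several megabytes

# A range still supports len(), indexing, and membership tests without building a list.
print(len(lazy), lazy[-1], 999_999 in lazy)   # 1000000 999999 True
```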
🧭 When to Use What? — Decision Guide
| Scenario | Best Choice | Why |
|---|---|---|
| Shopping cart items | List | Order matters, items change |
| GPS coordinates (lat, lng) | Tuple | Fixed data, should not change |
| Unique tags on a blog post | Set | No duplicates, order doesn't matter |
| Student profile (name, age, city) | Dictionary | Label each piece of data with a key |
| Collecting survey responses | List | Append as they come, duplicates OK |
| Days of the week | Tuple | Never changes (Mon–Sun) |
| Unique visitors to a website | Set | Auto-removes repeat visitors |
| Config settings (key=value) | Dictionary | Look up by setting name |
| Return multiple values from function | Tuple | Lightweight, unpacking support |
| Counting word frequencies | Dictionary | word → count mapping |
| Remove duplicates from a list | Set | list(set(my_list)); original order is not preserved |
| Undo history (stack) | List | append() + pop() = stack behavior |
Decision Flowchart
Ask yourself these questions:
1️⃣ Do I need key-value pairs? → Dictionary
2️⃣ Do I need only unique items? → Set
3️⃣ Should the data never change? → Tuple
4️⃣ Everything else? → List (the default choice)
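Three of the scenarios from the table, sketched in a few lines each. The sample data is invented for illustration.

```python
# Dictionary: counting word frequencies (word -> count)
sentence = "the cat and the dog and the bird"
counts = {}
for word in sentence.split():
    counts[word] = counts.get(word, 0) + 1
print(counts)    # {'the': 3, 'cat': 1, 'and': 2, 'dog': 1, 'bird': 1}

# List as a stack: append() pushes, pop() removes the most recent item
history = []
history.append("type A")
history.append("type B")
last_action = history.pop()
print(last_action, history)    # type B ['type A']

# Set: unique visitors (order is not guaranteed, so sort for display)
visitors = ["u1", "u2", "u1", "u3", "u2"]
unique_visitors = sorted(set(visitors))
print(unique_visitors)         # ['u1', 'u2', 'u3']
```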
Converting Between Types
# List ↔ Tuple ↔ Set
my_list = [1, 2, 2, 3, 3, 3]
my_tuple = tuple(my_list) # (1, 2, 2, 3, 3, 3)
my_set = set(my_list) # {1, 2, 3} — duplicates removed!
back_to_list = list(my_set) # [1, 2, 3]
# Dict keys/values to list
d = {"a": 1, "b": 2}
print(list(d.keys())) # ['a', 'b']
print(list(d.values())) # [1, 2]
print(list(d.items())) # [('a', 1), ('b', 2)]
🎓 Summary: Python's 4 built-in collections give you the right tool for every situation. Start with lists (most common), use dicts when you need labels, sets for uniqueness, and tuples for immutability.
➕ Operators
Operators let you perform calculations, comparisons, and logical operations.
Arithmetic Operators
a = 15
b = 4
print(a + b) # 19 Addition
print(a - b) # 11 Subtraction
print(a * b) # 60 Multiplication
print(a / b) # 3.75 Division (always float)
print(a // b) # 3 Floor division (integer)
print(a % b) # 3 Modulus (remainder)
print(a ** b) # 50625 Exponent (15^4)
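Floor division and modulus always fit together: quotient times divisor plus remainder gives back the original number. A short check of that identity, using the same a and b as above:

```python
a, b = 15, 4
quotient, remainder = a // b, a % b
print(quotient, remainder)              # 3 3
print(quotient * b + remainder == a)    # True
print(divmod(a, b))                     # (3, 3)  (both results in one call)

# "Floor" means rounding down, which matters for negative numbers.
print(-15 // 4, -15 % 4)                # -4 1

# / always gives a float, even when the division is exact.
print(8 / 4, type(8 / 4).__name__)      # 2.0 float
```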
Comparison Operators
These return True or False:
x = 10
y = 20
print(x == y) # False (equal to?)
print(x != y) # True (not equal?)
print(x < y) # True (less than?)
print(x > y) # False (greater than?)
print(x <= y) # True (less or equal?)
print(x >= y) # False (greater or equal?)
Logical Operators
a = True
b = False
print(a and b) # False (both must be True)
print(a or b) # True (at least one True)
print(not a) # False (flip the value)
💡 Tip: = is assignment (store a value). == is comparison (check equality). Mixing them up is the #1 beginner mistake!
📥 Input / Output
print(): Showing Output
# Basic print
print("Hello!")
# Multiple values
print("Age:", 25, "City:", "Mumbai")
# Age: 25 City: Mumbai
# Custom separator
print("A", "B", "C", sep="-")
# A-B-C
# Prevent new line at end
print("Loading", end="...")
print("Done!")
# Loading...Done!
input() — Getting User Input
# input() always returns a STRING
name = input("What's your name? ")
print(f"Hello, {name}!")
# For numbers, you must convert!
age = int(input("Enter your age: "))
year = 2026 - age
print(f"You were born around {year}")
⚠️ Key point: input() always returns a string. If you need a number, wrap it with int() or float().
🔀 Conditionals — Making Decisions
Conditionals let your program choose different paths based on conditions.
age = 18
if age >= 18:
    print("You can vote! ✅")
elif age >= 16:
    print("Almost there! 🔜")
else:
    print("Too young to vote ❌")
💡 Indentation matters! Python uses indentation (4 spaces by convention) to define code blocks. There are no braces {}: indentation is the structure.
Multiple Conditions (if-elif Chain)
score = 85
if score >= 90:
    grade = "A"
elif score >= 80:
    grade = "B"
elif score >= 70:
    grade = "C"
elif score >= 60:
    grade = "D"
else:
    grade = "F"
print(f"Score: {score} → Grade: {grade}")
# Score: 85 → Grade: B
Combining Conditions
age = 25
has_license = True
if age >= 18 and has_license:
    print("You can drive! 🚗")
# Ternary (one-liner)
status = "adult" if age >= 18 else "minor"
print(status) # adult
match-case — Python's Switch Statement (3.10+)
match-case is Python's version of a switch statement. It takes a value and checks it against multiple patterns — when a pattern matches, that block runs. Think of it as a cleaner alternative to writing long if-elif-elif-else chains when you're comparing one variable against many possible values.
When to use match-case:
- Checking one value against many fixed options (menu choices, commands, status codes)
- Replacing long if-elif chains that all compare the same variable
- Handling different message types, API responses, or user actions
When NOT to use: For range checks (x > 10), complex boolean logic, or if you need Python < 3.10 compatibility — stick with if-elif.
command = "start"
match command:
    case "start":
        print("Starting the engine... 🚀")
    case "stop":
        print("Stopping the engine... 🛑")
    case "pause":
        print("Pausing... ⏸️")
    case _:
        print("Unknown command ❓")
# Starting the engine... 🚀
Matching with Patterns
# Match numbers
status_code = 404
match status_code:
    case 200:
        print("OK ✅")
    case 301 | 302:
        print("Redirect ↪️")
    case 404:
        print("Not Found 🔍")
    case 500:
        print("Server Error 💥")
    case _:
        print(f"Status: {status_code}")
# Not Found 🔍
💡 Key points:
• case _: is the default/catch-all (like default: in switch)
• Use | to match multiple values in one case (like case 301 | 302)
• No break needed — Python doesn't fall through like C/Java
• Requires Python 3.10+
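If you are stuck on a Python version older than 3.10, a dictionary lookup is a common stand-in for simple value matching. This is a different technique from match-case, not an equivalent feature, and the function name `describe_status` is made up for the sketch.

```python
def describe_status(status_code):
    """Same mapping as the status-code example, using a dictionary lookup."""
    messages = {
        200: "OK",
        301: "Redirect",
        302: "Redirect",
        404: "Not Found",
        500: "Server Error",
    }
    # .get() supplies the fallback, playing the role of `case _:`
    return messages.get(status_code, f"Status: {status_code}")

print(describe_status(404))   # Not Found
print(describe_status(302))   # Redirect
print(describe_status(418))   # Status: 418
```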
🔁 Loops — Repeating Actions
for Loop: Iterate Over a Sequence
# Loop through a list
fruits = ["apple", "banana", "cherry"]
for fruit in fruits:
    print(f"I like {fruit}")
# I like apple
# I like banana
# I like cherry
range() — Generate Number Sequences
# Print 1 to 5
for i in range(1, 6):
    print(i, end=" ")
# 1 2 3 4 5
# Count by 2s
for i in range(0, 10, 2):
    print(i, end=" ")
# 0 2 4 6 8
while Loop — Repeat Until a Condition is False
count = 1
while count <= 5:
    print(f"Count: {count}")
    count += 1
# Count: 1
# Count: 2
# Count: 3
# Count: 4
# Count: 5
break & continue
break — Exit the loop
for i in range(10):
    if i == 5:
        break
    print(i, end=" ")
# 0 1 2 3 4
continue — Skip this iteration
for i in range(6):
    if i == 3:
        continue
    print(i, end=" ")
# 0 1 2 4 5
💡 Real-world analogy: for loop = "Do this for each student in the class." while loop = "Keep studying until you pass the exam."
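Two small sketches to go with the analogy. The first uses the built-in `enumerate()`, which gives you the index and the item together; the second is a while loop whose stopping point is not known up front. The score values are invented.

```python
fruits = ["apple", "banana", "cherry"]

# enumerate() yields (index, item) pairs, so no range(len(...)) is needed
for i, fruit in enumerate(fruits, start=1):
    print(f"{i}. {fruit}")
# 1. apple
# 2. banana
# 3. cherry

# A while loop fits when the number of repetitions is not known in advance
attempts = 0
score = 40
while score < 50:        # "keep studying until you pass"
    score += 5           # hypothetical gain per study session
    attempts += 1
print(attempts, score)   # 2 50
```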
🧩 Functions — Reusable Code Blocks
Functions let you write code once and use it many times. They take inputs (parameters), do something, and optionally return a result.
# Define a function
def greet(name):
    return f"Hello, {name}! 👋"
# Call the function
message = greet("Alice")
print(message)
# Hello, Alice! 👋
Parameters & Default Values
def introduce(name, age, city="Unknown"):
    print(f"{name} is {age} years old, from {city}")
introduce("Alice", 25, "Mumbai")
# Alice is 25 years old, from Mumbai
introduce("Bob", 30)
# Bob is 30 years old, from Unknown
Multiple Return Values
def calculate(a, b):
    return a + b, a - b, a * b
sum_val, diff, product = calculate(10, 3)
print(f"Sum: {sum_val}, Diff: {diff}, Product: {product}")
# Sum: 13, Diff: 7, Product: 30
🔍 Scope — Local vs Global Variables
x = 10 # Global variable
def my_func():
    x = 5 # Local variable (different from global x)
    print(f"Inside function: {x}")
my_func()
print(f"Outside function: {x}")
# Inside function: 5
# Outside function: 10
The local x inside the function doesn't affect the global x.
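If a function really does need to rebind a global name, Python requires the `global` keyword. The sketch below contrasts three approaches; the function names are made up, and returning a value is usually the clearer choice.

```python
counter = 0            # global variable

def increment_local():
    counter = 1        # creates a NEW local name; the global is untouched
    return counter

def increment_global():
    global counter     # opt in to rebinding the module-level name
    counter += 1

def incremented(value):
    return value + 1   # usually clearer: take input, return output

increment_local()
print(counter)         # 0
increment_global()
print(counter)         # 1
counter = incremented(counter)
print(counter)         # 2
```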
📁 File Handling — Read & Write Files
Python can read from and write to files on your computer. This is how you save data, load datasets, generate reports, and process logs.
Opening Files — The with Statement
Always use with to open files — it automatically closes the file when done, even if an error occurs.
| Mode | Meaning | Creates file? | Overwrites? |
|---|---|---|---|
| "r" | Read (default) | ❌ No (error if missing) | N/A |
| "w" | Write | ✅ Yes | ⚠️ Yes (erases old content!) |
| "a" | Append | ✅ Yes | ❌ No (adds to end) |
| "r+" | Read + Write | ❌ No | Partial |
Writing to a File
# Create and write to a file
with open("notes.txt", "w") as f:
    f.write("Hello from Python!\n")
    f.write("This is line 2.\n")
    f.write("This is line 3.\n")
# Append (add to end without erasing)
with open("notes.txt", "a") as f:
    f.write("This line was appended.\n")
Reading from a File
# Read entire file as one string
with open("notes.txt", "r") as f:
    content = f.read()
    print(content)
# Read line by line (memory efficient for big files)
with open("notes.txt", "r") as f:
    for line in f:
        print(line.strip()) # strip() removes \n
# Read all lines into a list
with open("notes.txt", "r") as f:
    lines = f.readlines()
    print(lines)
# ['Hello from Python!\n', 'This is line 2.\n', ...]
📝 Practical Example — CSV-like Data
# Write student scores
with open("scores.csv", "w") as f:
    f.write("Name,Math,Science\n")
    f.write("Alice,90,85\n")
    f.write("Bob,78,92\n")
    f.write("Charlie,88,76\n")
# Read and process
with open("scores.csv", "r") as f:
    header = f.readline() # skip header
    for line in f:
        parts = line.strip().split(",")
        name = parts[0]
        avg = (int(parts[1]) + int(parts[2])) / 2
        print(f"{name}: average = {avg}")
# Alice: average = 87.5
# Bob: average = 85.0
# Charlie: average = 82.0
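Splitting on commas works for simple data like this, but it breaks as soon as a field itself contains a comma. The standard library's `csv` module handles those cases. A minimal sketch of the same calculation, using `io.StringIO` as a stand-in for an open file so that nothing is written to disk (the `averages` dictionary is just an illustrative name):

```python
import csv
import io

# io.StringIO behaves like an open text file
data = io.StringIO("Name,Math,Science\nAlice,90,85\nBob,78,92\n")

averages = {}
for row in csv.DictReader(data):   # each row is a dict keyed by the header names
    averages[row["Name"]] = (int(row["Math"]) + int(row["Science"])) / 2

print(averages)  # {'Alice': 87.5, 'Bob': 85.0}
```

With a real file, pass the file object from `open("scores.csv", newline="")` to `csv.DictReader` instead.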
⚠️ Be careful with "w" mode — it erases everything in the file before writing. Use "a" (append) if you want to add to an existing file without losing data.
💡 Do I need to create the file first? No! When you use "w" or "a" mode, Python creates the file automatically if it doesn't exist. The "r" and "r+" modes require the file to already exist; otherwise you get a FileNotFoundError.
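The mode rules are easy to check for yourself. The sketch below works inside a temporary folder (so no real files are touched) and shows "w" erasing, "a" appending, and "r" failing on a missing file; the file names are placeholders.

```python
import os
import tempfile

read_failed = False

# Work in a throwaway folder that is deleted automatically afterwards
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "demo.txt")

    with open(path, "w") as f:   # "w" creates the file
        f.write("first\n")
    with open(path, "w") as f:   # "w" again erases "first"
        f.write("second\n")
    with open(path, "a") as f:   # "a" keeps "second" and adds to the end
        f.write("third\n")

    with open(path, "r") as f:
        lines = f.read().splitlines()
    print(lines)  # ['second', 'third']

    try:
        open(os.path.join(folder, "missing.txt"), "r")
    except FileNotFoundError:
        read_failed = True
        print("Reading a missing file raises FileNotFoundError")
```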
🛡️ Error Handling — try / except
Errors (exceptions) happen: a file might not exist, a user might type "abc" instead of a number, or a network call might fail. Error handling lets your program recover gracefully instead of crashing.
Basic try / except
# Without error handling — CRASHES
# num = int("hello") → ValueError!
# With error handling — recovers gracefully
try:
    num = int(input("Enter a number: "))
    print(f"You entered: {num}")
except ValueError:
    print("That's not a valid number! ❌")
How It Works
🔄 Flow
- Python tries the code in the try block
- If an error occurs, it jumps to the matching except block
- If no error, except is skipped
- finally always runs (optional)
🧱 Structure
try:
    # risky code
except ErrorType:
    # handle that error
else:
    # runs if NO error
finally:
    # ALWAYS runs
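To see the order in which the blocks run, this small sketch records each block it passes through. The function name `divide` and the `steps` list are made up for illustration.

```python
def divide(a, b):
    steps = []                    # records which blocks ran
    try:
        steps.append("try")
        result = a / b
    except ZeroDivisionError:
        steps.append("except")    # only runs when the division fails
        result = None
    else:
        steps.append("else")      # only runs when there was no error
    finally:
        steps.append("finally")   # runs in both cases
    return result, steps

print(divide(10, 2))  # (5.0, ['try', 'else', 'finally'])
print(divide(10, 0))  # (None, ['try', 'except', 'finally'])
```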
Common Exception Types
| Exception | When It Happens | Example |
|---|---|---|
| ValueError | Wrong value for conversion | int("abc") |
| TypeError | Wrong type in operation | "hello" + 5 |
| ZeroDivisionError | Dividing by zero | 10 / 0 |
| FileNotFoundError | File doesn't exist | open("missing.txt") |
| IndexError | List index out of range | [1,2,3][10] |
| KeyError | Dict key doesn't exist | {"a":1}["b"] |
| NameError | Variable not defined | print(xyz) |
Catching Multiple Exceptions
try:
    x = int(input("Enter a number: "))
    result = 100 / x
    print(f"Result: {result}")
except ValueError:
    print("Not a number!")
except ZeroDivisionError:
    print("Can't divide by zero!")
except Exception as e:
    print(f"Something went wrong: {e}")
else:
    print("Everything worked! ✅")
finally:
    print("This always runs (cleanup)")
📝 Practical Example — Safe File Reading
def read_file_safe(filename):
    try:
        with open(filename, "r") as f:
            return f.read()
    except FileNotFoundError:
        print(f"❌ File '{filename}' not found")
        return None
content = read_file_safe("data.txt")
if content is not None: # an empty file returns "", which still counts as found
    print(content)
else:
    print("Using default data instead")
💡 Best practice: Catch specific exceptions (like ValueError), not just bare except:. A bare except hides bugs and makes debugging harder.
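Here is a small sketch of how a bare except can hide a bug. Both functions are hypothetical; the first contains a deliberate typo (`txt` instead of `text`) that raises a NameError, which the bare except silently swallows.

```python
def parse_age_bare(text):
    try:
        return int(txt)      # deliberate typo: should be `text`, raises NameError
    except:                  # bare except swallows the NameError too
        return None

def parse_age_specific(text):
    try:
        return int(text)
    except ValueError:       # only the error we actually expect
        return None

print(parse_age_bare("42"))       # None (the typo is hidden!)
print(parse_age_specific("42"))   # 42
print(parse_age_specific("abc"))  # None
```

Had the first function caught only ValueError, the NameError would have surfaced immediately and pointed straight at the typo.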
📦 Modules & Packages — Reusing Code
A module is a Python file containing functions, classes, and variables that you can import and reuse. Packages are folders of modules. Python comes with hundreds of built-in modules, and you can install thousands more.
Importing Modules
# Import entire module
import math
print(math.sqrt(144)) # 12.0
print(math.pi) # 3.141592653589793
# Import specific function
from math import sqrt, pi
print(sqrt(64)) # 8.0
print(pi) # 3.14...
# Import with alias
import math as m
print(m.floor(3.7)) # 3
print(m.ceil(3.2)) # 4
Useful Built-in Modules
| Module | What It Does | Example |
|---|---|---|
| math | Math functions | math.sqrt(25) → 5.0 |
| random | Random numbers | random.randint(1, 10) |
| datetime | Dates & times | datetime.datetime.now() |
| os | File system ops | os.listdir(".") |
| json | Read/write JSON | json.loads(text) |
| csv | Read/write CSV | csv.reader(file) |
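As one example from the table, the `json` module converts between Python objects and JSON text. A minimal round-trip sketch (the `student` dictionary is made-up sample data):

```python
import json

student = {"name": "Alice", "scores": [90, 85]}

text = json.dumps(student)     # Python dict -> JSON string
print(text)                    # {"name": "Alice", "scores": [90, 85]}

restored = json.loads(text)    # JSON string -> Python dict
print(restored["scores"][0])   # 90
print(restored == student)     # True
```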
The random Module — Examples
import random
# Random integer between 1 and 10
print(random.randint(1, 10))
# Random float between 0 and 1
print(random.random())
# Pick a random item from a list
colors = ["red", "blue", "green"]
print(random.choice(colors))
# Shuffle a list in place
cards = [1, 2, 3, 4, 5]
random.shuffle(cards)
print(cards) # e.g. [3, 1, 5, 2, 4]
Installing External Packages (pip)
# Run in a terminal, not inside Python: install a package from PyPI (Python Package Index)
pip install requests
pip install pandas numpy matplotlib
# Then use it in your Python code
import requests
response = requests.get("https://api.github.com")
print(response.json())
📝 Creating Your Own Module
In Jupyter Notebook, use the %%writefile magic command to create a .py file directly from a cell:
%%writefile myutils.py
# myutils.py — your custom module
def greet(name):
    return f"Hello, {name}! 👋"
def add(a, b):
    return a + b
PI = 3.14159
Running that cell creates myutils.py in your notebook's directory. Then import it in the next cell:
# Import and use your module
from myutils import greet, add, PI
print(greet("Alice")) # Hello, Alice! 👋
print(add(10, 20)) # 30
print(PI) # 3.14159
Any .py file can be a module — just import it by filename (without .py).
💡 Rule of thumb: Before writing something from scratch, search "python module for ___". There's probably a package for it already!
📊 NumPy & Pandas — Data Science Essentials
These two libraries are the foundation of Data Science in Python. NumPy handles numerical arrays, and Pandas handles tabular data (like Excel/CSV).
Install first: pip install numpy pandas
🔢 NumPy — Numerical Python
NumPy gives you arrays — like lists but much faster for math. All elements must be the same type.
import numpy as np
# Create arrays
a = np.array([1, 2, 3, 4, 5])
print(a) # [1 2 3 4 5]
print(type(a)) # <class 'numpy.ndarray'>
# Math on entire array at once (vectorized — no loops needed!)
print(a * 2) # [ 2 4 6 8 10]
print(a + 10) # [11 12 13 14 15]
print(a ** 2) # [ 1 4 9 16 25]
# Useful functions
print(np.mean(a)) # 3.0 (average)
print(np.sum(a)) # 15 (total)
print(np.std(a)) # 1.41 (standard deviation)
print(np.max(a)) # 5
print(np.min(a)) # 1
Creating Special Arrays
print(np.zeros(5)) # [0. 0. 0. 0. 0.]
print(np.ones(3)) # [1. 1. 1.]
print(np.arange(0, 10, 2)) # [0 2 4 6 8]
print(np.linspace(0, 1, 5)) # [0. 0.25 0.5 0.75 1.]
print(np.random.randint(1, 100, 5)) # 5 random ints
2D Arrays (Matrices)
matrix = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])
print(matrix.shape) # (3, 3) — 3 rows, 3 cols
print(matrix[0]) # [1 2 3] — first row
print(matrix[1, 2]) # 6 — row 1, col 2
print(matrix.sum()) # 45 — sum of all
print(matrix.sum(axis=0)) # [12 15 18] — column sums
print(matrix.sum(axis=1)) # [ 6 15 24] — row sums
💡 Why NumPy over lists? For math on large arrays, NumPy is often tens of times faster than plain Python loops, because its core is written in C and it operates on entire arrays at once. The exact speedup depends on the operation and the array size.
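A vectorized operation gives the same result as the equivalent Python loop, just without writing the loop. A minimal sketch (the variable names are illustrative; to measure the speed difference on your own machine, the standard `timeit` module is one option):

```python
import numpy as np

values = list(range(1, 6))                # plain Python list: [1, 2, 3, 4, 5]
arr = np.array(values)                    # the same numbers as a NumPy array

squares_loop = [v ** 2 for v in values]   # Python loop, one element at a time
squares_vec = arr ** 2                    # one vectorized operation on the whole array

print(squares_loop)           # [1, 4, 9, 16, 25]
print(squares_vec.tolist())   # [1, 4, 9, 16, 25]
```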
🐼 Pandas — DataFrames for Tabular Data
Pandas gives you DataFrames — like Excel spreadsheets in Python. Each column is a Series, and the whole table is a DataFrame.
import pandas as pd
# Create a DataFrame from a dictionary
data = {
    "Name": ["Alice", "Bob", "Charlie", "Diana"],
    "Age": [25, 30, 22, 28],
    "City": ["Mumbai", "Delhi", "Hyderabad", "Chennai"],
    "Score": [92, 78, 88, 95]
}
df = pd.DataFrame(data)
print(df)
Name Age City Score
0 Alice 25 Mumbai 92
1 Bob 30 Delhi 78
2 Charlie 22 Hyderabad 88
3 Diana 28 Chennai 95
Exploring Data
print(df.head()) # first 5 rows
print(df.tail(2)) # last 2 rows
print(df.shape) # (4, 4) — 4 rows, 4 cols
print(df.columns) # column names
print(df.describe()) # statistics (mean, std, min, max)
df.info() # data types and null counts (info() prints directly, so no print() is needed)
Selecting Data
# Select a column (returns a Series)
print(df["Name"])
# Select multiple columns
print(df[["Name", "Score"]])
# Filter rows — students with score > 85
top_students = df[df["Score"] > 85]
print(top_students)
# Name Age City Score
# 0 Alice 25 Mumbai 92
# 2 Charlie 22 Hyderabad 88
# 3 Diana 28 Chennai 95
Adding & Modifying Columns
# Add a new column
df["Pass"] = df["Score"] >= 80
# Calculated column
df["Grade"] = df["Score"].apply(
    lambda x: "A" if x >= 90 else "B" if x >= 80 else "C"
)
print(df)
Name Age City Score Pass Grade
0 Alice 25 Mumbai 92 True A
1 Bob 30 Delhi 78 False C
2 Charlie 22 Hyderabad 88 True B
3 Diana 28 Chennai 95 True A
📝 Reading CSV Files — The Most Common Use
First, create the CSV file using %%writefile in a Jupyter cell:
%%writefile students.csv
Name,Age,City,Score,Subject
Alice,25,Mumbai,92,Math
Bob,30,Delhi,78,Science
Charlie,22,Hyderabad,88,Math
Diana,28,Chennai,95,Science
Eve,24,Mumbai,67,English
Frank,26,Delhi,91,Math
Grace,21,Hyderabad,83,English
Hiro,29,Chennai,74,Science
Ivy,23,Mumbai,96,Math
Jay,27,Delhi,59,English
Kavya,22,Hyderabad,88,Science
Leo,25,Chennai,72,Math
Meera,24,Mumbai,85,English
Nikhil,28,Delhi,93,Science
Olivia,21,Hyderabad,77,Math
Running this cell creates students.csv with 15 students. Now load and explore it:
import pandas as pd
# Read the CSV file into a DataFrame
df = pd.read_csv("students.csv")
# Quick look
print(df.head()) # first 5 rows
print(df.shape) # (15, 5) — 15 rows, 5 columns
print(df.describe()) # statistics for numeric columns
df.info() # data types and null check (info() prints directly)
# Filter — students who scored above 90
toppers = df[df["Score"] > 90]
print(toppers)
# Sort by score (highest first)
sorted_df = df.sort_values("Score", ascending=False)
print(sorted_df.head())
# Save filtered results to a new CSV
toppers.to_csv("toppers.csv", index=False)
This is the core Data Science workflow: create/load data → explore → filter → analyze → save results.
📊 Basic Statistics with Pandas
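# Values shown are for the 4-student DataFrame built earlier; students.csv gives different numbers
print(df["Score"].mean()) # 88.25 (average)
print(df["Score"].median()) # 90.0 (middle value)
print(df["Score"].std()) # 7.41 (spread: sample standard deviation)
print(df["Score"].max()) # 95
print(df["Score"].min()) # 78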
# Group by city and get average score
print(df.groupby("City")["Score"].mean())
# Value counts — how many from each city?
print(df["City"].value_counts())
💡 NumPy vs Pandas: Use NumPy for raw number crunching (arrays, matrices, math). Use Pandas for structured/tabular data (CSV, Excel, databases). Pandas actually uses NumPy under the hood!
🎬 Mini Project — Movies Recommendation System
Let's put everything together! You'll build a simple movie recommendation system using Python, Pandas, and everything you've learned. We'll create a movie dataset, explore it, filter it, and build a basic recommender.
🎯 What you'll practice: File I/O, DataFrames, filtering, sorting, functions, loops, dictionaries, and string operations — all in one project!
📦 Step 1 — Create the Movie Dataset
Run this cell in Jupyter to create a CSV file with 20 movies:
%%writefile movies.csv
Title,Genre,Rating,Runtime,Year,Director
The Shawshank Redemption,Drama,9.3,142,1994,Frank Darabont
The Dark Knight,Action,9.0,152,2008,Christopher Nolan
Inception,Sci-Fi,8.8,148,2010,Christopher Nolan
Pulp Fiction,Crime,8.9,154,1994,Quentin Tarantino
Forrest Gump,Drama,8.8,142,1994,Robert Zemeckis
The Matrix,Sci-Fi,8.7,136,1999,Wachowski Sisters
Interstellar,Sci-Fi,8.6,169,2014,Christopher Nolan
The Godfather,Crime,9.2,175,1972,Francis Ford Coppola
Fight Club,Drama,8.8,139,1999,David Fincher
The Lion King,Animation,8.5,88,1994,Roger Allers
Toy Story,Animation,8.3,81,1995,John Lasseter
Avengers: Endgame,Action,8.4,181,2019,Russo Brothers
Joker,Drama,8.4,122,2019,Todd Phillips
Parasite,Thriller,8.5,132,2019,Bong Joon-ho
Spider-Man: No Way Home,Action,8.3,148,2021,Jon Watts
Dune,Sci-Fi,8.0,155,2021,Denis Villeneuve
Coco,Animation,8.4,105,2017,Lee Unkrich
The Prestige,Thriller,8.5,130,2006,Christopher Nolan
Whiplash,Drama,8.5,107,2014,Damien Chazelle
Up,Animation,8.3,96,2009,Pete Docter
This creates movies.csv with 20 movies, 6 columns. Now let's explore!
🔍 Step 2 — Load & Explore the Data
import pandas as pd
df = pd.read_csv("movies.csv")
# First look
print(df.head())
print(f"\nTotal movies: {len(df)}")
print(f"Columns: {list(df.columns)}")
print(f"\nData types:\n{df.dtypes}")
print(f"\nBasic stats:\n{df.describe()}")
💡 Always start by exploring! Before any analysis, check: How many rows? What columns? Any missing data? What are the data types? This is called EDA (Exploratory Data Analysis).
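The exploration code above does not check for missing data. One common way is `isna().sum()`, which counts the missing values in each column. The sketch below uses a tiny made-up dataset (not movies.csv) with one deliberately empty rating so there is something to find.

```python
import io
import pandas as pd

# Tiny made-up dataset; the empty Rating field for "Movie B" is read as NaN (missing)
csv_text = "Title,Rating,Runtime\nMovie A,8.1,120\nMovie B,,95\n"
sample = pd.read_csv(io.StringIO(csv_text))

print(sample.shape)          # (2, 3)
print(sample.isna().sum())   # missing values per column

missing_ratings = int(sample["Rating"].isna().sum())
print(missing_ratings)       # 1
```

Running `df.isna().sum()` on the movies DataFrame should report zero for every column, since the file created in Step 1 has no empty fields.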
❓ Step 3 — Answer Questions with Code
Now let's answer real questions about our movies. Try each one yourself before looking at the solution!
🎬 Q1: Get all movies with runtime more than 90 minutes
Filter the DataFrame to show only movies longer than 90 minutes.
# Filter: runtime > 90 minutes
long_movies = df[df["Runtime"] > 90]
print(f"Movies longer than 90 min: {len(long_movies)} out of {len(df)}")
print(long_movies[["Title", "Runtime", "Genre"]])
Movies longer than 90 min: 18 out of 20
Title Runtime Genre
0 The Shawshank Redemption 142 Drama
1 The Dark Knight 152 Action
2 Inception 148 Sci-Fi
... (and 15 more)
Only The Lion King (88 min) and Toy Story (81 min) are under 90 minutes.
⭐ Q2: Which movies have a rating above 8.5?
# Top-rated movies (rating > 8.5)
top_rated = df[df["Rating"] > 8.5].sort_values("Rating", ascending=False)
print(top_rated[["Title", "Rating", "Year"]])
Title Rating Year
The Shawshank Redemption 9.3 1994
The Godfather 9.2 1972
The Dark Knight 9.0 2008
Pulp Fiction 8.9 1994
Inception 8.8 2010
Forrest Gump 8.8 1994
Fight Club 8.8 1999
The Matrix 8.7 1999
Interstellar 8.6 2014
🎬 Q3: How many movies per genre?
# Count movies by genre
genre_counts = df["Genre"].value_counts()
print(genre_counts)
Drama 5
Sci-Fi 4
Animation 4
Action 3
Crime 2
Thriller 2
🎥 Q4: Find all Christopher Nolan movies
# Filter by director
nolan = df[df["Director"] == "Christopher Nolan"]
print(f"Nolan has {len(nolan)} movies in our dataset:\n")
print(nolan[["Title", "Genre", "Rating", "Year"]])
Nolan has 4 movies in our dataset:
Title Genre Rating Year
The Dark Knight Action 9.0 2008
Inception Sci-Fi 8.8 2010
Interstellar Sci-Fi 8.6 2014
The Prestige Thriller 8.5 2006
📅 Q5: Movies from the 90s (1990–1999) sorted by rating
# Filter by year range
movies_90s = df[(df["Year"] >= 1990) & (df["Year"] <= 1999)]
movies_90s = movies_90s.sort_values("Rating", ascending=False)
print(f"90s movies ({len(movies_90s)} found):\n")
print(movies_90s[["Title", "Rating", "Year"]])
90s movies (7 found):
Title Rating Year
The Shawshank Redemption 9.3 1994
Pulp Fiction 8.9 1994
Forrest Gump 8.8 1994
Fight Club 8.8 1999
The Matrix 8.7 1999
The Lion King 8.5 1994
Toy Story 8.3 1995
📊 Q6: Average rating per genre
# Group by genre, calculate average rating
avg_by_genre = df.groupby("Genre")["Rating"].mean().sort_values(ascending=False)
print("Average rating by genre:\n")
for genre, avg in avg_by_genre.items():
    print(f" {genre:12s} → {avg:.2f}")
Average rating by genre:
Crime → 9.05
Drama → 8.76
Action → 8.57
Sci-Fi → 8.53
Thriller → 8.50
Animation → 8.38
⏱️ Q7: Longest and shortest movies
# Find extremes
longest = df.loc[df["Runtime"].idxmax()]
shortest = df.loc[df["Runtime"].idxmin()]
print(f"🏆 Longest: {longest['Title']} ({longest['Runtime']} min)")
print(f"⚡ Shortest: {shortest['Title']} ({shortest['Runtime']} min)")
print(f"\n📏 Average runtime: {df['Runtime'].mean():.0f} min")
🏆 Longest: Avengers: Endgame (181 min)
⚡ Shortest: Toy Story (81 min)
📏 Average runtime: 135 min
🌟 Q8: Add a "Category" column based on rating
# Categorize movies by rating
def categorize(rating):
    if rating >= 9.0:
        return "🏆 Masterpiece"
    elif rating >= 8.5:
        return "⭐ Excellent"
    elif rating >= 8.0:
        return "👍 Great"
    else:
        return "👌 Good"
df["Category"] = df["Rating"].apply(categorize)
print(df[["Title", "Rating", "Category"]].head(10))
Title Rating Category
The Shawshank Redemption 9.3 🏆 Masterpiece
The Dark Knight 9.0 🏆 Masterpiece
Inception 8.8 ⭐ Excellent
Pulp Fiction 8.9 ⭐ Excellent
Forrest Gump 8.8 ⭐ Excellent
The Matrix 8.7 ⭐ Excellent
Interstellar 8.6 ⭐ Excellent
The Godfather 9.2 🏆 Masterpiece
Fight Club 8.8 ⭐ Excellent
The Lion King 8.5 ⭐ Excellent
🤖 Step 4 — Build a Simple Recommender
Now let's build a function that recommends movies based on what the user likes!
# Read the dataset first
df = pd.read_csv("movies.csv")
def recommend_movies(df, genre=None, min_rating=8.0, max_runtime=999, top_n=5):
    """Recommend movies based on filters."""
    result = df.copy()
    # Apply filters
    if genre:
        result = result[result["Genre"].str.lower() == genre.lower()]
    result = result[result["Rating"] >= min_rating]
    result = result[result["Runtime"] <= max_runtime]
    # Sort by rating and keep the top N
    result = result.sort_values("Rating", ascending=False).head(top_n)
    if len(result) == 0:
        print("No movies found matching your criteria 😞")
        return
    print(f"\n🎬 Recommended Movies ({len(result)} found):\n")
    for i, (_, movie) in enumerate(result.iterrows(), 1):
        print(f" {i}. {movie['Title']}")
        print(f" ⭐ {movie['Rating']} | ⏱️ {movie['Runtime']} min | 🎨 {movie['Genre']} | 📅 {movie['Year']}")
        print()
Try It Out!
# 🌟 Show me the best Sci-Fi movies
recommend_movies(df, genre="Sci-Fi")
🎬 Recommended Movies (4 found):
1. Inception
⭐ 8.8 | ⏱️ 148 min | 🎨 Sci-Fi | 📅 2010
2. The Matrix
⭐ 8.7 | ⏱️ 136 min | 🎨 Sci-Fi | 📅 1999
3. Interstellar
⭐ 8.6 | ⏱️ 169 min | 🎨 Sci-Fi | 📅 2014
4. Dune
⭐ 8.0 | ⏱️ 155 min | 🎨 Sci-Fi | 📅 2021
# 🎥 Short movies (under 2 hours) with great ratings
recommend_movies(df, min_rating=8.3, max_runtime=120)
🎬 Recommended Movies (5 found):
1. The Lion King
⭐ 8.5 | ⏱️ 88 min | 🎨 Animation | 📅 1994
2. Whiplash
⭐ 8.5 | ⏱️ 107 min | 🎨 Drama | 📅 2014
3. Coco
⭐ 8.4 | ⏱️ 105 min | 🎨 Animation | 📅 2017
4. Toy Story
⭐ 8.3 | ⏱️ 81 min | 🎨 Animation | 📅 1995
5. Up
⭐ 8.3 | ⏱️ 96 min | 🎨 Animation | 📅 2009
# 🤠 Best Action movies
recommend_movies(df, genre="Action", top_n=3)
🎬 Recommended Movies (3 found):
1. The Dark Knight
⭐ 9.0 | ⏱️ 152 min | 🎨 Action | 📅 2008
2. Avengers: Endgame
⭐ 8.4 | ⏱️ 181 min | 🎨 Action | 📅 2019
3. Spider-Man: No Way Home
⭐ 8.3 | ⏱️ 148 min | 🎨 Action | 📅 2021
🎯 Step 5 — Challenges (Try Yourself!)
These are open-ended — try solving them before peeking at the hints.
🚨 Challenge 1: Find movies where the title contains "The"
the_movies = df[df["Title"].str.contains("The")]
print(f"Movies with 'The' in title: {len(the_movies)}\n")
print(the_movies["Title"].tolist())
🚨 Challenge 2: Which director has the highest average rating?
director_avg = df.groupby("Director")["Rating"].mean().sort_values(ascending=False)
print("Directors by average rating:\n")
print(director_avg.head(10))
🚨 Challenge 3: Create a "Decade" column (1970s, 1990s, 2000s…)
df["Decade"] = (df["Year"] // 10) * 10
df["Decade"] = df["Decade"].apply(lambda d: f"{d}s")
print(df[["Title", "Year", "Decade"]])
# Movies per decade
print("\nMovies per decade:")
print(df["Decade"].value_counts().sort_index())
🚨 Challenge 4: Interactive recommendation (ask user for input)
# Interactive version
print("\n🎬 Movie Recommender \n")
print("Available genres:", sorted(df["Genre"].unique()))
genre = input("\nEnter genre (or press Enter for all): ").strip()
min_r = input("Minimum rating (default 8.0): ").strip()
max_t = input("Max runtime in minutes (default any): ").strip()
recommend_movies(df,
    genre=genre if genre else None,
    min_rating=float(min_r) if min_r else 8.0,
    max_runtime=int(max_t) if max_t else 999
)
🚨 Challenge 5: Export your analysis to a report file
with open("movie_report.txt", "w") as f:
    f.write("=== MOVIE ANALYSIS REPORT ===\n\n")
    f.write(f"Total movies: {len(df)}\n")
    f.write(f"Average rating: {df['Rating'].mean():.2f}\n")
    f.write(f"Average runtime: {df['Runtime'].mean():.0f} min\n\n")
    f.write("Top 5 Movies:\n")
    top5 = df.sort_values("Rating", ascending=False).head(5)
    for _, m in top5.iterrows():
        f.write(f" - {m['Title']} ({m['Rating']})\n")
print("✅ Report saved to movie_report.txt")
🎓 What you just built:
✅ Loaded a real dataset with Pandas
✅ Explored data with head(), describe(), value_counts()
✅ Filtered, sorted, and grouped data
✅ Created new columns with apply()
✅ Built a recommendation function with parameters
✅ Exported results to files
🚀 You're now ready for Data Science! Next, try adding matplotlib to visualize your data with charts.
🏗️ Object-Oriented Programming (OOP) — Classes
OOP lets you create your own data types by bundling data (attributes) and behavior (methods) together into a class. Think of a class as a blueprint, and objects as the actual things built from that blueprint.
Analogy: A class is like a cookie cutter 🍪 — it defines the shape. An object is the actual cookie you make from it. You can make many cookies (objects) from one cutter (class).
Your First Class
class Student:
    def __init__(self, name, age, grade):
        self.name = name
        self.age = age
        self.grade = grade
    def introduce(self):
        return f"Hi, I'm {self.name}, age {self.age}, grade {self.grade}"
    def is_passing(self):
        return self.grade >= 50
s1 = Student("Alice", 25, 92)
s2 = Student("Bob", 22, 45)
print(s1.introduce()) # Hi, I'm Alice, age 25, grade 92
print(s2.is_passing()) # False
print(s1.name) # Alice
Key Concepts
📌 self
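self refers to the current object. It must be the first parameter of every regular (instance) method. The name self is a convention rather than a keyword, but virtually all Python code follows it.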
🔧 __init__
The constructor — runs automatically when you create a new object.
📦 Attributes
Variables attached to an object (self.name). Each object has its own copy.
⚡ Methods
Functions inside a class. Called with object.method().
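The sketch below shows these four ideas working together: two objects built from the same class each keep their own attribute values. The `Counter` class here is a made-up example, not something from a library.

```python
class Counter:
    def __init__(self, start=0):
        self.count = start      # attribute stored on this particular object

    def increment(self):
        self.count += 1         # self is the object the method was called on
        return self.count

a = Counter()       # __init__ runs with start=0
b = Counter(10)     # __init__ runs with start=10
a.increment()
a.increment()
b.increment()
print(a.count)  # 2
print(b.count)  # 11
```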
Inheritance
class Animal:
    def __init__(self, name, sound):
        self.name = name
        self.sound = sound
    def speak(self):
        return f"{self.name} says {self.sound}!"
class Dog(Animal):
    def __init__(self, name, breed):
        super().__init__(name, "Woof")
        self.breed = breed
    def fetch(self):
        return f"{self.name} fetches the ball! 🎾"
dog = Dog("Buddy", "Golden Retriever")
print(dog.speak()) # Buddy says Woof!
print(dog.fetch()) # Buddy fetches the ball! 🎾
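A child class can also replace (override) a method it inherits. The sketch below uses hypothetical `Employee` and `Manager` classes to show overriding, plus `isinstance` to confirm that a child object still counts as an instance of its parent.

```python
class Employee:
    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"{self.name} is an employee"

class Manager(Employee):
    def describe(self):                  # overrides Employee.describe
        return f"{self.name} is a manager"

staff = [Employee("Ravi"), Manager("Sara")]
for person in staff:
    print(person.describe())             # each object uses its own class's version
# Ravi is an employee
# Sara is a manager

print(isinstance(staff[1], Employee))    # True: a Manager is also an Employee
```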
📝 Practical Example — Bank Account
class BankAccount:
    def __init__(self, owner, balance=0):
        self.owner = owner
        self.balance = balance
    def deposit(self, amount):
        if amount > 0:
            self.balance += amount
            print(f"✅ Deposited ₹{amount}. Balance: ₹{self.balance}")
    def withdraw(self, amount):
        if amount > self.balance:
            print("❌ Insufficient funds!")
        else:
            self.balance -= amount
            print(f"✅ Withdrew ₹{amount}. Balance: ₹{self.balance}")
acc = BankAccount("Alice", 1000)
acc.deposit(500) # ✅ Deposited ₹500. Balance: ₹1500
acc.withdraw(200) # ✅ Withdrew ₹200. Balance: ₹1300
acc.withdraw(5000) # ❌ Insufficient funds!
💡 When to use classes? When you have multiple related pieces of data + actions that go together.
🏋️ Practice Exercises
Try these exercises to test your understanding. Each one is followed by a solution.
Exercise 1: Temperature Converter
def celsius_to_fahrenheit(celsius):
    return (celsius * 9/5) + 32
temp_c = float(input("Enter temperature in Celsius: "))
print(f"{temp_c}°C = {celsius_to_fahrenheit(temp_c)}°F")
Exercise 2: Even or Odd Checker
num = int(input("Enter a number: "))
print(f"{num} is {'Even ✅' if num % 2 == 0 else 'Odd'}")
Exercise 3: FizzBuzz
for i in range(1, 21):
    if i % 15 == 0: print("FizzBuzz")
    elif i % 3 == 0: print("Fizz")
    elif i % 5 == 0: print("Buzz")
    else: print(i)
Exercise 4: List Statistics (without built-ins)
numbers = [12, 5, 8, 23, 3, 17, 9]
total = 0
minimum = maximum = numbers[0]
for num in numbers:
    total += num
    if num < minimum: minimum = num
    if num > maximum: maximum = num
print(f"Sum:{total} Avg:{total/len(numbers)} Min:{minimum} Max:{maximum}")
Exercise 5: Student Grade Dictionary
students = {"Alice": 92, "Bob": 78, "Charlie": 95}
topper = max(students, key=students.get)
print(f"🏆 Topper: {topper} with {students[topper]} marks")
Exercise 6: Word Counter
sentence = "the cat sat on the mat the cat smiled"
word_count = {}
for word in sentence.split():
    word_count[word] = word_count.get(word, 0) + 1
for w, c in word_count.items():
    print(f"{w}: {c}")
Quick Cheat Sheet
| Concept | Syntax | Example |
|---|---|---|
| Variable | name = value | age = 25 |
| Print | print() | print("Hello") |
| Input | input() | name = input("Name: ") |
| If/Else | if condition: | if x > 0: print("+") |
| For loop | for x in seq: | for i in range(5): |
| While loop | while cond: | while x < 10: |
| Function | def name(): | def greet(n): return f"Hi {n}" |
| List | [a, b, c] | nums = [1, 2, 3] |
| Dict | {k: v} | d = {"name": "Alice"} |
| f-String | f"text {var}" | f"Age: {age}" |
🎉 Congratulations! You've covered all the Python fundamentals!
🚀 You made it! From variables to a full movie recommendation system.