Python requests Tutorial: A Complete Guide from GET Requests to Session Management


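Introduction

requests is one of the most popular third-party HTTP libraries in the Python ecosystem and bills itself as "HTTP for Humans". Whether you are calling RESTful APIs, scraping web pages, or automating network requests in office workflows, requests is a common first choice. This tutorial starts with installation and walks step by step through the core usage of the library, with runnable code examples throughout.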

1. Installation and Environment Setup

Installing requests takes a single command:

pip install requests

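Verify that the installation succeeded:

import requests
print(requests.__version__)  # example output: 2.31.0

Using a venv virtual environment is recommended so that the install does not conflict with globally installed packages: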

python -m venv myenv
source myenv/bin/activate  # Linux/macOS
myenv\Scripts\activate     # Windows
pip install requests

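2. Sending GET Requests

GET is the most basic HTTP method and is used to retrieve a resource. requests makes a GET request very concise:

import requests

# The simplest GET request
response = requests.get('https://api.github.com')
print(response.status_code)  # 200
print(response.text[:200])   # first 200 characters of the response body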

GET Requests with Parameters

Pass URL query parameters through the params argument; requests URL-encodes them automatically:

import requests

# Search GitHub for Python repositories
url = 'https://api.github.com/search/repositories'
params = {
    'q': 'language:python',
    'sort': 'stars',
    'per_page': 3
}
response = requests.get(url, params=params)
data = response.json()

for repo in data['items']:
    print(f"{repo['full_name']} - ⭐ {repo['stargazers_count']}")
# Example output (illustrative; the actual repositories and star counts change over time):
# tensorflow/tensorflow - ⭐ 188000+
# django/django - ⭐ 82000+
# pallets/flask - ⭐ 69000+
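
To see what "automatically encoded" means without touching the network, the request can be built and inspected before it is sent. The following is a minimal sketch using requests.Request and its prepare() method; the URL and parameters are the ones from the example above.

```python
from urllib.parse import parse_qs, urlsplit

import requests

# Build the request without sending it, so no network access is needed.
request = requests.Request(
    'GET',
    'https://api.github.com/search/repositories',
    params={'q': 'language:python', 'sort': 'stars', 'per_page': 3},
)
prepared = request.prepare()

# The final URL carries the percent-encoded query string.
print(prepared.url)

# Decoding the query string again shows that no parameter was lost or altered.
query = parse_qs(urlsplit(prepared.url).query)
print(query)
```

Inspecting prepared.url this way is also a quick debugging aid when an API rejects a query and you want to see exactly what would be sent.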

Custom Request Headers

import requests

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36',
    'Accept': 'application/json',
    'Authorization': 'Bearer your_token_here'
}
response = requests.get('https://api.github.com/user', headers=headers)
print(response.json())

3. Sending POST Requests

POST requests submit data to the server, for example to create a resource or submit a form:

import requests

# Send JSON data (the most common case)
url = 'https://jsonplaceholder.typicode.com/posts'
data = {
    'title': 'Python requests tutorial',
    'body': 'A tutorial about the requests library',
    'userId': 1
}
response = requests.post(url, json=data)
print(f"Status code: {response.status_code}")  # 201 Created
print(f"Created resource ID: {response.json()['id']}")  # 101

Sending Form Data

import requests

# Simulate a form submission
url = 'https://httpbin.org/post'
form_data = {'username': 'admin', 'password': 'secret123'}
response = requests.post(url, data=form_data)
print(response.json()['form'])
# Output: {'username': 'admin', 'password': 'secret123'}
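
The difference between json= and data= is easy to miss, because both accept a dict. As a rough offline illustration (the requests are prepared but never sent), the sketch below compares the Content-Type header and body that each argument produces for the same payload.

```python
import json

import requests

payload = {'username': 'admin', 'password': 'secret123'}
url = 'https://httpbin.org/post'

# json= serializes the dict to a JSON body and sets a JSON content type.
json_request = requests.Request('POST', url, json=payload).prepare()
# data= with a dict produces a URL-encoded form body instead.
form_request = requests.Request('POST', url, data=payload).prepare()

print(json_request.headers['Content-Type'])
print(json.loads(json_request.body))

print(form_request.headers['Content-Type'])
print(form_request.body)
```

If a server expects JSON and receives a form body (or the reverse), it will usually reject the request or ignore the fields, so checking which argument is in use is a good first debugging step.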

Uploading Files

import requests

# Upload a single file; the with block closes the file handle afterwards
with open('report.xlsx', 'rb') as f:
    response = requests.post('https://httpbin.org/post', files={'file': f})
print(response.json()['files'])

# Upload multiple files: each value is (filename, file object, content type)
with open('report.pdf', 'rb') as pdf, open('image.png', 'rb') as img:
    files = {
        'file1': ('report.pdf', pdf, 'application/pdf'),
        'file2': ('image.png', img, 'image/png')
    }
    response = requests.post('https://httpbin.org/post', files=files)
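
Passing files= makes requests encode the body as multipart/form-data. The sketch below shows this offline by preparing, but not sending, an upload; the in-memory file and its name hello.txt are made up for illustration and stand in for a real file on disk.

```python
import io

import requests

# An in-memory file stands in for a real file on disk (hypothetical content).
fake_file = io.BytesIO(b'hello, requests')
files = {'file': ('hello.txt', fake_file, 'text/plain')}

prepared = requests.Request('POST', 'https://httpbin.org/post', files=files).prepare()

content_type = prepared.headers['Content-Type']
print(content_type.split(';')[0])  # the media type, without the boundary part
print(len(prepared.body), 'bytes in the encoded body')
```

The boundary string in the Content-Type header is generated by the library, which is why you should not set that header yourself when uploading files.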

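4. Session Management

A Session object persists cookies and headers across requests, which makes it the core tool for scraping behind a login and for making repeated API calls:

import requests

# Create a Session object
session = requests.Session()

# Set session-wide headers: every request sent through this session carries them
session.headers.update({
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36'
})

# Log in first (httpbin stands in for a real login endpoint here)
login_url = 'https://httpbin.org/post'
login_data = {'username': 'testuser', 'password': '123456'}
session.post(login_url, data=login_data)

# Later requests automatically send any cookies the server set at login
profile = session.get('https://httpbin.org/cookies')
print(profile.json())  # httpbin's /post sets no cookies, so this demo prints an empty cookie dict

# Close the Session
session.close()

# Preferred: a context manager closes the session automatically
with requests.Session() as s:
    s.get('https://httpbin.org/get')
    # connections are released when the with block exits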
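
How a Session combines its own state with each request can be checked offline through Session.prepare_request. In the sketch below, the client name my-app/1.0, the X-Demo header, and the sessionid cookie are all hypothetical values chosen for illustration; the cookie is stored by hand to mimic what a login response would normally do.

```python
import requests

with requests.Session() as session:
    session.headers.update({'User-Agent': 'my-app/1.0'})  # hypothetical client name
    # Manually store a cookie, as a login response would normally do.
    session.cookies.set('sessionid', 'abc123', domain='httpbin.org', path='/')

    request = requests.Request(
        'GET',
        'https://httpbin.org/cookies',
        headers={'X-Demo': 'per-request'},  # hypothetical per-request header
    )
    # prepare_request merges session headers and cookies into this one request.
    prepared = session.prepare_request(request)

print(prepared.headers['User-Agent'])
print(prepared.headers['X-Demo'])
print(prepared.headers['Cookie'])
```

Session-level headers act as defaults, and headers passed on an individual request are merged on top of them.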

Typical Session Use Case

import requests

# Simulate scraping Zhihu after logging in (simplified sketch)
session = requests.Session()
session.headers.update({'User-Agent': 'Mozilla/5.0'})

# Step 1: fetch the login page and extract the XSRF token
login_page = session.get('https://www.zhihu.com/signin')
# ... parse the token (simplified here)

# Step 2: submit the login
session.post('https://www.zhihu.com/api/v3/oauth/sign_in',
    json={'username': 'xxx', 'password': 'xxx'})

# Step 3: use the same session to access a page that requires login
response = session.get('https://www.zhihu.com/api/v4/me')
print(response.status_code)  # 200 means the login succeeded

5. Response Handling in Detail

The Response object returned by requests exposes a rich set of attributes and methods:

import requests

response = requests.get('https://api.github.com')

# Common attributes
print(f"Status code: {response.status_code}")          # 200
print(f"Content type: {response.headers['Content-Type']}")  # application/json; charset=utf-8
print(f"Encoding: {response.encoding}")                # utf-8
print(f"Elapsed: {response.elapsed.total_seconds():.3f}s")  # e.g. 0.452s

# Ways to read the body
print(response.text)       # body decoded to a str
print(response.content)    # raw bytes (suited to images and files)
print(response.json())     # body parsed as JSON (most common)

# Checking the response status
print(response.ok)         # True (status_code < 400)
print(response.is_redirect)  # whether the response is a redirect
print(response.url)        # final URL (after following redirects)
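
The same attributes can be explored without a live server by filling in a Response object by hand. This is only a stand-in for experimentation, assuming a made-up JSON body; real responses are built by the library, not constructed like this.

```python
import io

import requests

# Hand-built Response standing in for a real server reply (made-up body).
response = requests.Response()
response.status_code = 200
response.headers['Content-Type'] = 'application/json; charset=utf-8'
response.encoding = 'utf-8'
response.raw = io.BytesIO('{"name": "requests", "tags": ["http"]}'.encode('utf-8'))

print(type(response.content).__name__)  # bytes
print(type(response.text).__name__)     # str

data = response.json()
print(data['name'], data['tags'])

# Header lookups are case-insensitive.
print(response.headers['content-type'])
```

The split between content (bytes) and text (str) matters when saving binary files: write response.content, not response.text.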

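6. Timeouts and Error Handling

Always set a timeout in production. requests applies no timeout by default, so a request can hang indefinitely:

import requests
from requests.exceptions import RequestException, Timeout, ConnectionError

try:
    # Timeouts: 3 seconds to connect, 10 seconds to read
    response = requests.get('https://httpbin.org/delay/5',
                            timeout=(3, 10))
    response.raise_for_status()  # raises HTTPError for 4xx and 5xx status codes
    print(response.json())

except Timeout:
    print("❌ Request timed out; check the network or the target server")
except ConnectionError:
    print("❌ Connection failed; the target server is unreachable")
except RequestException as e:
    print(f"❌ Request error: {e}")

# A more concise approach: catch the base class of all requests exceptions
try:
    resp = requests.get('https://api.github.com', timeout=10)
    resp.raise_for_status()
except RequestException as e:
    print(f"Request failed: {e}")
else:
    print(f"✅ Request succeeded, status code: {resp.status_code}")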
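
Two details of the error handling above are worth verifying: which status codes raise_for_status() treats as errors, and how the exception classes relate to each other. The sketch below checks both offline; fake_response and raises_http_error are illustrative helpers, not part of requests.

```python
import requests
from requests.exceptions import (
    ConnectionError,
    ConnectTimeout,
    HTTPError,
    RequestException,
    Timeout,
)


def fake_response(status_code: int) -> requests.Response:
    """Build an offline Response carrying only a status code (illustrative helper)."""
    response = requests.Response()
    response.status_code = status_code
    return response


def raises_http_error(status_code: int) -> bool:
    """Return True if raise_for_status() raises HTTPError for this status code."""
    try:
        fake_response(status_code).raise_for_status()
    except HTTPError:
        return True
    return False


# Only 4xx and 5xx codes raise; 201, 204 and redirects do not.
for code in (200, 201, 204, 301, 404, 503):
    print(code, raises_http_error(code))

# Every requests exception derives from RequestException.
print(issubclass(Timeout, RequestException), issubclass(HTTPError, RequestException))
# ConnectTimeout is both a ConnectionError and a Timeout, so except-clause order matters.
print(issubclass(ConnectTimeout, ConnectionError), issubclass(ConnectTimeout, Timeout))
```

Because ConnectTimeout matches both Timeout and ConnectionError, whichever of those except clauses comes first will handle it.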

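7. Retries and Adapters

Mounting an HTTPAdapter with a Retry policy enables automatic retries, which helps ride out transient network problems:

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Create a Session
session = requests.Session()

# Configure the retry policy
retry_strategy = Retry(
    total=3,                    # retry at most 3 times
    backoff_factor=1,           # exponential backoff between retries (exact delays depend on the urllib3 version)
    status_forcelist=[500, 502, 503, 504],  # status codes that trigger a retry
    allowed_methods=['GET', 'POST'],  # methods eligible for retry (retrying POST can repeat side effects)
    raise_on_status=False       # return the last response instead of raising once retries run out
)

adapter = HTTPAdapter(max_retries=retry_strategy)
session.mount('https://', adapter)
session.mount('http://', adapter)

# A 503 response is retried automatically; once the retries are used up, the last response is returned
response = session.get('https://httpbin.org/status/503')
print(f"Final status code: {response.status_code}")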
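
In a larger program it can be convenient to wrap this setup in a small factory function so every caller gets the same policy. The sketch below is one possible shape, not a prescribed pattern: build_retrying_session is a hypothetical helper name, and it deliberately retries only GET, a narrower choice than the example above, because repeating a POST can duplicate side effects.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def build_retrying_session(total: int = 3, backoff_factor: float = 1.0) -> requests.Session:
    """Return a Session whose HTTP and HTTPS adapters share one retry policy."""
    retry = Retry(
        total=total,
        backoff_factor=backoff_factor,
        status_forcelist=[500, 502, 503, 504],
        allowed_methods=['GET'],   # idempotent requests only (an assumption of this sketch)
        raise_on_status=False,     # hand back the last response instead of raising
    )
    session = requests.Session()
    adapter = HTTPAdapter(max_retries=retry)
    session.mount('https://', adapter)
    session.mount('http://', adapter)
    return session


session = build_retrying_session()
# Look up the adapter that would handle this URL and inspect its policy (no request is sent).
policy = session.get_adapter('https://httpbin.org/status/503').max_retries
print(policy.total, policy.backoff_factor, sorted(policy.status_forcelist))
session.close()
```

Reading the policy back through get_adapter is a cheap way to confirm in a unit test that the adapter was actually mounted.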

8. Hands-On: A Weather API Query Helper

Putting the pieces together, here is a command-line weather lookup tool:

import requests
from datetime import datetime

def get_weather(city: str, api_key: str = "demo_key"):
    """Look up the current weather for a city."""
    # "demo_key" is a placeholder; replace it with your own OpenWeatherMap API key
    url = "https://api.openweathermap.org/data/2.5/weather"
    params = {
        'q': city,
        'appid': api_key,
        'units': 'metric',  # Celsius
        'lang': 'zh_cn'     # language of the weather description (Simplified Chinese)
    }

    try:
        response = requests.get(url, params=params, timeout=10)
        response.raise_for_status()
        data = response.json()

        weather_info = {
            'City': data['name'],
            'Weather': data['weather'][0]['description'],
            'Temperature': f"{data['main']['temp']}°C",
            'Feels like': f"{data['main']['feels_like']}°C",
            'Humidity': f"{data['main']['humidity']}%",
            'Wind speed': f"{data['wind']['speed']} m/s",
            'Updated': datetime.fromtimestamp(data['dt']).strftime('%Y-%m-%d %H:%M')
        }

        print(f"\n🌤️  Current weather report for {city}")
        print("=" * 30)
        for key, value in weather_info.items():
            print(f"{key}: {value}")

    except requests.exceptions.HTTPError as e:
        print(f"❌ HTTP error: {e}")
    except requests.exceptions.ConnectionError:
        print("❌ Cannot connect to the weather server; check your network")
    except requests.exceptions.Timeout:
        print("❌ Request timed out")
    except Exception as e:
        print(f"❌ Unexpected error: {e}")

if __name__ == '__main__':
    get_weather('Beijing')
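
One possible refinement is to separate the parsing from the network call, so the parsing can be tested without an API key. The sketch below assumes the same payload fields that the function above reads; parse_weather is a hypothetical helper and the sample payload is made up, not real API output.

```python
from datetime import datetime, timezone


def parse_weather(data: dict) -> dict:
    """Pick the fields used by the report out of a weather API payload."""
    updated = datetime.fromtimestamp(data['dt'], tz=timezone.utc)
    return {
        'city': data['name'],
        'description': data['weather'][0]['description'],
        'temperature_c': data['main']['temp'],
        'humidity_pct': data['main']['humidity'],
        'wind_mps': data['wind']['speed'],
        # Format the Unix timestamp as an explicit UTC time.
        'updated_utc': updated.strftime('%Y-%m-%d %H:%M'),
    }


# Made-up sample payload with the same shape the function reads.
sample = {
    'name': 'Beijing',
    'weather': [{'description': 'clear sky'}],
    'main': {'temp': 21.5, 'feels_like': 20.9, 'humidity': 40},
    'wind': {'speed': 3.2},
    'dt': 1700000000,
}

report = parse_weather(sample)
for key, value in report.items():
    print(f"{key}: {value}")
```

Using an explicit UTC timezone avoids the report time silently depending on the machine's local timezone, which is what a bare datetime.fromtimestamp(...) call does.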

9. Common Problems and Pitfalls

import requests

# ❌ Pitfall 1: garbled text because the encoding was never set
response = requests.get('https://example.com')
# response.encoding may be ISO-8859-1
# ✅ Fix: set the encoding explicitly
response.encoding = 'utf-8'
print(response.text)

# ❌ Pitfall 2: blocked by anti-scraping measures because no User-Agent was set
response = requests.get('https://some-website.com')
# returns 403 or an empty body
# ✅ Fix: send a browser-like User-Agent header
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36'}
response = requests.get('https://some-website.com', headers=headers)

# ❌ Pitfall 3: forgetting timeout, so the program hangs
response = requests.get('https://slow-server.com')  # may hang for minutes
# ✅ Fix: always set a timeout
response = requests.get('https://slow-server.com', timeout=5)
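
The encoding pitfall can be reproduced offline. The sketch below builds a Response by hand (a stand-in for a page whose server declared no charset; the body is made up) and shows that the same bytes decode differently depending on response.encoding.

```python
import io

import requests

# Offline stand-in for a page whose server omitted the charset (made-up body).
response = requests.Response()
response.status_code = 200
response.raw = io.BytesIO('café'.encode('utf-8'))

# The encoding requests falls back to for text responses without a charset.
response.encoding = 'ISO-8859-1'
garbled = response.text
print(garbled)

# Setting the encoding explicitly changes how .text decodes the same bytes.
response.encoding = 'utf-8'
fixed = response.text
print(fixed)
```

The bytes in response.content never change; only the decoding applied by response.text does, which is why assigning response.encoding after the request still fixes the output.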

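FAQ

Q1: What is the difference between requests and urllib? Which is better?

A: requests is a high-level wrapper built on urllib3, with a friendlier API. For comparison: requests sends a GET in one line (requests.get()), while urllib needs urllib.request.urlopen() plus extra handling. requests also takes care of JSON parsing, session management, connection pooling, and SSL verification. Unless you have special requirements, prefer requests.

Q2: Why does requests.get() return garbled text?

A: requests infers the encoding from the Content-Type response header, but some sites send a wrong charset or none at all. To fix it, set response.encoding = 'utf-8', or assign response.apparent_encoding (detected from the content) to response.encoding. When you know the encoding, setting it explicitly with response.encoding = 'utf-8' is the more reliable choice.

Q3: When using a Session, when do I need to close the connection manually?

A: With a context manager (with requests.Session() as s:) you do not need to close anything manually; the session is closed when the with block exits. If you create a Session by hand, call session.close() when you are done. A Session also stores cookies and a connection pool, so repeatedly creating new Sessions instead of reusing one hurts performance. Use a Session whenever you need to keep a login state or send many requests to the same host.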
