Python Libraries for Finance
Essential packages for financial data science
Python Libraries Reference
This page provides a practical guide to the Python libraries used across the course materials, organised by functional area.
📊 Core Data Science Stack
pandas - Data Manipulation and Analysis
import pandas as pd
# Essential for all financial data work
df = pd.read_csv('financial_data.csv')
returns = df['price'].pct_change()

- Purpose: Data manipulation, cleaning, and analysis
- Key Features: DataFrames, time series handling, groupby operations
- Documentation: pandas.pydata.org
- Finance Use: Portfolio analysis, return calculations, data cleaning
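The return calculation mentioned above can be sketched without an input file. This is a minimal illustration, assuming a short series of made-up closing prices; the values and the names `prices`, `simple_returns`, and `cumulative_return` are hypothetical and not part of the course data.

```python
import pandas as pd

# Hypothetical closing prices; in practice these would come from a file or API
prices = pd.Series(
    [100.0, 102.0, 101.0, 105.0],
    index=pd.date_range("2024-01-01", periods=4, freq="D"),
    name="price",
)

# Simple returns: (P_t / P_{t-1}) - 1; the first value is NaN and is dropped
simple_returns = prices.pct_change().dropna()

# Cumulative return over the whole window, compounding each period
cumulative_return = (1 + simple_returns).prod() - 1

print(simple_returns.round(4))
print(f"Cumulative return: {cumulative_return:.2%}")
```

Compounding the period returns gives the same figure as comparing the last price with the first, which is a quick sanity check on any return series.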
NumPy - Numerical Computing
import numpy as np
# Mathematical operations and array computing
asset_returns = np.array([0.05, 0.02, 0.08])  # illustrative per-asset returns
portfolio_weights = np.array([0.3, 0.4, 0.3])
portfolio_return = np.dot(asset_returns, portfolio_weights)

- Purpose: Numerical computing and mathematical operations
- Key Features: Arrays, linear algebra, random number generation
- Documentation: numpy.org
- Finance Use: Mathematical calculations, matrix operations, simulations
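The matrix operations listed above extend naturally from a single portfolio return to portfolio risk. The following is a rough sketch, assuming a small made-up matrix of daily returns (rows are days, columns are assets); all numbers are hypothetical.

```python
import numpy as np

# Hypothetical daily returns: rows are days, columns are three assets
asset_returns = np.array([
    [0.010, 0.002, -0.004],
    [-0.005, 0.004, 0.006],
    [0.007, -0.001, 0.003],
    [0.002, 0.003, -0.002],
])
portfolio_weights = np.array([0.3, 0.4, 0.3])  # weights sum to 1

# Per-day portfolio return: weighted sum across assets (one value per day)
portfolio_returns = asset_returns @ portfolio_weights

# Portfolio variance w' S w, where S is the sample covariance of the asset columns
cov_matrix = np.cov(asset_returns, rowvar=False)
portfolio_vol = np.sqrt(portfolio_weights @ cov_matrix @ portfolio_weights)

print(portfolio_returns)
print(f"Daily volatility: {portfolio_vol:.4%}")
```

The quadratic form gives the same number as the sample standard deviation of `portfolio_returns`, which is a useful cross-check when the weights or covariance come from different sources.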
📈 Financial Data Libraries
yfinance - Yahoo Finance Data
import yfinance as yf
# Download stock data
data = yf.download('AAPL', start='2020-01-01', end='2024-01-01')
ticker_info = yf.Ticker('AAPL').info

- Purpose: Free access to Yahoo Finance data
- Key Features: Historical prices, company info, financial statements
- Documentation: GitHub
- Finance Use: Stock prices, market data, fundamental analysis
pandas-datareader - Multiple Data Sources
import pandas_datareader as pdr
# Access various financial data sources
fed_data = pdr.get_data_fred('GDP', start='2020-01-01')

- Purpose: Access multiple financial data providers
- Key Features: FRED, World Bank, Yahoo Finance, Alpha Vantage
- Documentation: pandas-datareader.readthedocs.io
- Finance Use: Economic data, international markets, alternative datasets
QuantLib-Python - Quantitative Finance
import QuantLib as ql
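# Advanced financial calculations
payoff = ql.PlainVanillaPayoff(ql.Option.Call, 100.0)  # call struck at 100 (illustrative)
exercise = ql.EuropeanExercise(ql.Date(15, 6, 2025))  # illustrative expiry date
option = ql.EuropeanOption(payoff, exercise)

- Purpose: Advanced quantitative finance calculations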
- Key Features: Options pricing, fixed income, risk management
- Documentation: quantlib.org
- Finance Use: Derivatives pricing, yield curve modelling, risk calculations
🎨 Visualization Libraries
matplotlib - Static Plotting
import matplotlib.pyplot as plt
# Create professional financial charts
# dates and prices are assumed to be existing sequences of equal length
plt.figure(figsize=(12, 8))
plt.plot(dates, prices, linewidth=2)
plt.title('Stock Price Analysis')

- Purpose: Static plotting and chart creation
- Key Features: Line plots, histograms, subplots, customization
- Documentation: matplotlib.org
- Finance Use: Price charts, return distributions, correlation plots
seaborn - Statistical Visualization
import seaborn as sns
# Statistical plots with better defaults
# correlation_matrix is assumed to be a square DataFrame of pairwise correlations
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm')

- Purpose: Statistical visualization with better aesthetics
- Key Features: Heatmaps, distribution plots, regression plots
- Documentation: seaborn.pydata.org
- Finance Use: Correlation analysis, distribution comparisons, regression analysis
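The heatmap call above expects a correlation matrix that the snippet does not construct. One way to build it, sketched here with simulated returns for three made-up tickers (`AAA`, `BBB`, `CCC` are placeholders, not real symbols), is the pandas `corr` method:

```python
import numpy as np
import pandas as pd

# Simulated daily returns for three hypothetical tickers (seeded for reproducibility)
rng = np.random.default_rng(42)
returns = pd.DataFrame(
    rng.normal(loc=0.0, scale=0.01, size=(250, 3)),
    columns=["AAA", "BBB", "CCC"],
)

# Pairwise Pearson correlation between the return columns
correlation_matrix = returns.corr()

print(correlation_matrix.round(2))
```

The resulting DataFrame is symmetric with ones on the diagonal and can be passed directly to `sns.heatmap`.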
plotly - Interactive Visualization
import plotly.graph_objects as go
import plotly.express as px
# Interactive financial dashboards
# df is assumed to be a DataFrame with Open, High, Low, Close columns and a date index
fig = go.Figure(data=go.Candlestick(x=df.index, open=df['Open'],
                                    high=df['High'], low=df['Low'], close=df['Close']))

- Purpose: Interactive and web-ready visualizations
- Key Features: Candlestick charts, 3D plots, animations, dashboards
- Documentation: plotly.com/python
- Finance Use: Interactive dashboards, real-time charts, presentations
🤖 Machine Learning Libraries
scikit-learn - Traditional ML
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
# Implement ML models for finance
# X_train and y_train are assumed to be prepared feature and target arrays
model = RandomForestRegressor(n_estimators=100)
model.fit(X_train, y_train)

- Purpose: Traditional machine learning algorithms
- Key Features: Classification, regression, clustering, model selection
- Documentation: scikit-learn.org
- Finance Use: Price prediction, risk modelling, portfolio optimisation
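One caveat for price prediction: `train_test_split` shuffles rows by default, which can let later observations leak into the training set when the data are time-ordered. A chronological split avoids this. The sketch below uses plain NumPy slicing on made-up arrays; the 80/20 ratio and the variable names are assumptions for illustration only.

```python
import numpy as np

# Hypothetical feature matrix and target, ordered from oldest to newest
n_obs = 100
X = np.arange(n_obs * 2, dtype=float).reshape(n_obs, 2)
y = np.arange(n_obs, dtype=float)

# Hold out the most recent 20% as the test set, without shuffling
split = int(n_obs * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

print(X_train.shape, X_test.shape)
```

Passing `shuffle=False` to `train_test_split` should produce the same kind of ordered split if you prefer to stay within scikit-learn.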
XGBoost - Gradient Boosting
import xgboost as xgb
# High-performance gradient boosting
model = xgb.XGBRegressor(n_estimators=100, learning_rate=0.1)
model.fit(X_train, y_train)

- Purpose: Advanced gradient boosting algorithms
- Key Features: High performance, feature importance, early stopping
- Documentation: xgboost.readthedocs.io
- Finance Use: Credit scoring, fraud detection, market prediction
TensorFlow - Deep Learning
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
# Build neural networks for finance
model = Sequential([
    LSTM(50, return_sequences=True),
    LSTM(50),
    Dense(1)
])

- Purpose: Deep learning and neural networks
- Key Features: LSTM, CNN, RNN, automatic differentiation
- Documentation: tensorflow.org
- Finance Use: Time series prediction, algorithmic trading, pattern recognition
📈 Specialised Financial Libraries
arch - GARCH Models
from arch import arch_model
# Volatility modelling
model = arch_model(returns, vol='Garch', p=1, q=1)
fitted_model = model.fit()

- Purpose: ARCH and GARCH volatility modelling
- Key Features: Multiple GARCH specifications, forecasting, diagnostics
- Documentation: arch.readthedocs.io
- Finance Use: Volatility forecasting, risk management, options pricing
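To make the volatility forecasting use case more concrete, the GARCH(1,1) variance recursion can be written out by hand. This is only a sketch of the recursion with hypothetical parameter values and simulated returns; it is not how the `arch` package estimates the model, which fits `omega`, `alpha`, and `beta` by maximum likelihood.

```python
import numpy as np

# Hypothetical GARCH(1,1) parameters; a fitted model would estimate these
omega, alpha, beta = 0.00001, 0.08, 0.90

# Simulated demeaned returns standing in for real data
rng = np.random.default_rng(0)
returns = rng.normal(0.0, 0.01, size=500)

# Conditional variance: sigma2[t] = omega + alpha * r[t-1]^2 + beta * sigma2[t-1]
sigma2 = np.empty_like(returns)
sigma2[0] = returns.var()  # one common choice of starting value
for t in range(1, len(returns)):
    sigma2[t] = omega + alpha * returns[t - 1] ** 2 + beta * sigma2[t - 1]

# One-step-ahead variance forecast and the long-run (unconditional) variance
next_variance = omega + alpha * returns[-1] ** 2 + beta * sigma2[-1]
long_run_variance = omega / (1 - alpha - beta)

print(f"Next-day volatility forecast: {np.sqrt(next_variance):.4%}")
print(f"Long-run volatility: {np.sqrt(long_run_variance):.4%}")
```

The long-run variance only exists when `alpha + beta` is below 1, which is why that sum is often read as a measure of volatility persistence.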
statsmodels - Econometrics
import statsmodels.api as sm
from statsmodels.tsa.arima.model import ARIMA
# Econometric analysis
model = ARIMA(data, order=(1,1,1))
fitted_model = model.fit()

- Purpose: Statistical modelling and econometrics
- Key Features: ARIMA, regression, time series analysis, hypothesis testing
- Documentation: statsmodels.org
- Finance Use: Time series modelling, regression analysis, statistical testing
zipline - Backtesting Framework
import zipline
from zipline.api import order, record, symbol
# Professional backtesting
def initialize(context):
    context.asset = symbol('AAPL')

def handle_data(context, data):
    order(context.asset, 100)

- Purpose: Professional algorithmic trading backtesting
- Key Features: Realistic trading simulation, performance analytics
- Documentation: zipline.io
- Finance Use: Strategy backtesting, performance evaluation, risk analysis
🧠 AI and NLP Libraries
transformers - Modern NLP
from transformers import pipeline
# Financial sentiment analysis
finbert = pipeline("sentiment-analysis", model="ProsusAI/finbert")
sentiment = finbert("Company reports strong quarterly earnings")

- Purpose: State-of-the-art natural language processing
- Key Features: Pre-trained models, FinBERT, BERT, GPT integration
- Documentation: huggingface.co/transformers
- Finance Use: Sentiment analysis, document processing, automated research
TextBlob - Simple NLP
from textblob import TextBlob
# Quick sentiment analysis
text = "Apple stock surges after earnings beat"
sentiment = TextBlob(text).sentiment.polarity

- Purpose: Simple and intuitive NLP operations
- Key Features: Sentiment analysis, part-of-speech tagging, noun phrases
- Documentation: textblob.readthedocs.io
- Finance Use: Basic sentiment analysis, text preprocessing
OpenAI - Generative AI
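from openai import OpenAI
# AI-powered financial analysis
# Uses the client interface from openai 1.0 onwards; the older
# openai.ChatCompletion.create call was removed in that release.
client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Analyse this company's financials..."}]
)

- Purpose: Access to GPT models for generative AI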
- Key Features: Text generation, analysis, summarisation
- Documentation: platform.openai.com
- Finance Use: Automated research, report generation, analysis assistance
🔧 Development and Deployment
Flask - Web Applications
from flask import Flask, jsonify, request
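# Build financial APIs
app = Flask(__name__)

@app.route('/api/predict', methods=['POST'])
def predict():
    # ML model prediction endpoint
    features = request.get_json()  # JSON payload sent by the client
    result = None  # placeholder: replace with your model's prediction for features
    return jsonify({'prediction': result})

- Purpose: Web application development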
- Key Features: REST APIs, web interfaces, microservices
- Documentation: flask.palletsprojects.com
- Finance Use: Trading APIs, portfolio dashboards, client interfaces
Streamlit - Data Applications
import streamlit as st
# Create financial dashboards
st.title('Portfolio Analysis Dashboard')
ticker = st.selectbox('Select Stock', ['AAPL', 'GOOGL', 'MSFT'])

- Purpose: Rapid development of data applications
- Key Features: Interactive widgets, real-time updates, easy deployment
- Documentation: streamlit.io
- Finance Use: Prototyping dashboards, client demos, internal tools
📚 Installation Guide
Complete Environment Setup
# Create conda environment
conda create -n fin510 python=3.9
conda activate fin510
# Install core packages
conda install pandas numpy matplotlib seaborn plotly
conda install scikit-learn jupyter jupyterlab
# Install financial packages (the QuantLib Python bindings are published on PyPI as QuantLib)
pip install yfinance pandas-datareader QuantLib
pip install arch statsmodels scipy
# Install ML/AI packages
pip install xgboost lightgbm tensorflow
pip install transformers openai textblob
# Install development tools
pip install flask streamlit requests beautifulsoup4

Verification Script
# Run this to verify your installation
import importlib

# Import names, which can differ from install names (scikit-learn imports as sklearn)
required_packages = [
    'pandas', 'numpy', 'matplotlib', 'seaborn', 'plotly',
    'sklearn', 'yfinance', 'arch', 'statsmodels',
    'tensorflow', 'xgboost', 'textblob', 'transformers'
]

missing = []
for package in required_packages:
    try:
        importlib.import_module(package)
        print(f"✓ {package}")
    except ImportError:
        print(f"✗ {package}")
        missing.append(package)

if not missing:
    print("\n🎉 All packages installed successfully!")
else:
    print(f"\n❌ Missing: {', '.join(missing)}")

🆘 Troubleshooting
Common Issues
Package Installation Failures
# Try different installation methods
pip install package_name
conda install package_name
conda install -c conda-forge package_name

SSL Certificate Errors
# Add to scripts that use yfinance
# Workaround only: this disables HTTPS certificate verification for the whole process
import ssl
ssl._create_default_https_context = ssl._create_unverified_context

Memory Issues
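# Optimise for large datasets
import pandas as pd
# Silences SettingWithCopyWarning; this setting does not itself reduce memory use
pd.set_option('mode.chained_assignment', None)
# Use chunking for large files so only one block of rows is in memory at a time
for chunk in pd.read_csv('large_file.csv', chunksize=10000):
    process(chunk)  # process is a placeholder for your own per-chunk function

Getting Help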
- Course Discussion Board: Technical questions and peer support
- Office Hours: By appointment with Professor Quinn
- IT Support: 028 9536 7188 for hardware/software issues
- Online Resources: Stack Overflow, library documentation
This reference will be updated throughout the course as we introduce new libraries and techniques.