Commit ddaa28a

refactor: major database overhaul (#65)
* feat: add BackupModel, PadModel, and UserModel for database schema
- Introduced BackupModel to manage backups with relationships to PadModel.
- Created PadModel to represent pads with relationships to UserModel and BackupModel.
- Added UserModel to handle user data and relationships with PadModel.
- Each model includes fields for UUID, timestamps, and relevant data types, utilizing SQLAlchemy for ORM functionality.

* feat: introduce base model and refactor existing models for database schema
- Added BaseModel to centralize common fields and schema configuration for all models.
- Refactored BackupModel, PadModel, and UserModel to inherit from BaseModel, enhancing code reusability.
- Updated relationships and added indexing for improved query performance.
- Implemented to_dict methods in models for easier data serialization.

* feat: implement repository modules for database operations
- Added a new repository module for database operations, including UserRepository, PadRepository, and BackupRepository.
- Each repository provides methods for creating, retrieving, updating, and deleting records related to users, pads, and backups.
- Enhanced modularity and organization of database interactions within the application.

* feat: add service modules for user, pad, and backup management
- Introduced UserService, PadService, and BackupService to handle business logic related to users, pads, and backups.
- Each service includes methods for creating, retrieving, updating, and deleting records, enhancing modularity and organization of the application.
- Added an __init__.py file to facilitate service module imports.

* feat: implement database module with async support
- Added a new database module to manage database connections and session handling using SQLAlchemy with async capabilities.
- Introduced init_db function to initialize the database schema and tables.
- Created repository and service dependency functions for user, pad, and backup management.
- Updated requirements.txt to include psycopg2-binary for PostgreSQL support.

* refactor: update SQLAlchemy column types in UserModel
- Changed VARCHAR to String for username and email fields in UserModel to align with SQLAlchemy best practices.
- This update enhances code consistency and readability within the database model.

* refactor: update models to use schema configuration
- Refactored BackupModel, PadModel, and UserModel to utilize SCHEMA_NAME for schema configuration, enhancing consistency across models.
- Removed the get_schema method from BaseModel to streamline schema handling.
- Updated ForeignKey references to align with the new schema approach, improving clarity and maintainability.

* refactor: update UUID type usage in BackupModel and PadModel
- Replaced custom UUIDType with SQLAlchemy's built-in UUID type for source_id and owner_id fields in BackupModel and PadModel, respectively.
- This change enhances code clarity and aligns with SQLAlchemy best practices for UUID handling.

* refactor: centralize schema configuration and update database initialization
- Introduced SCHEMA_NAME in base_model.py for consistent schema handling across models.
- Updated init_db function to use CreateSchema for schema creation, improving clarity and maintainability.
- Adjusted imports in models to reflect the new schema configuration, enhancing code organization.

* refactor: update UserModel and UserRepository to use UUID for user IDs
- Modified UserModel to use SQLAlchemy's UUID type for the primary key, aligning with Keycloak's UUID requirements.
- Updated UserRepository's create method to accept a user_id parameter, allowing for explicit user ID assignment during user creation.

* feat: enhance user model and repository for additional user attributes
- Updated UserModel to include new fields: email_verified, name, given_name, family_name, and roles, allowing for more comprehensive user data management.
- Modified UserRepository's create method to accept these new fields, facilitating their inclusion during user creation.
- Introduced a new user router to handle user creation and retrieval endpoints, improving API functionality and user management capabilities.

* feat: add JWT token handling and user management dependencies
- Introduced functions for decoding JWT tokens and retrieving current user information, enhancing authentication flow.
- Implemented user creation logic for new users based on token data, improving user management capabilities.
- Added an admin role check to enforce access control, ensuring only authorized users can access certain resources.

* feat: add TemplatePadModel for managing template pads
- Introduced TemplatePadModel to represent the template pads table in the app schema, enhancing the database structure.
- Defined columns for name, display_name, and data, ensuring comprehensive data management for template pads.
- Added an index on display_name for improved query performance.

* feat: add TemplatePad repository, service, and router for template pad management
- Introduced TemplatePadRepository for database operations related to template pads, including create, read, update, and delete functionalities.
- Added TemplatePadService to encapsulate business logic for template pad management, ensuring data validation and error handling.
- Created a new router for template pad endpoints, providing API access for creating, retrieving, updating, and deleting template pads, with admin access control.
- Updated existing modules to integrate the new template pad features, enhancing overall application functionality.

* feat: add docstrings to user router functions for improved clarity
- Added docstrings to the create_user, get_all_users, get_user_info, get_user_count, and get_user functions to clarify their purpose and access restrictions (admin only).
- This enhancement improves code documentation and aids in understanding the functionality of user management endpoints.

* feat: update requirements for database and file handling
- Added python-multipart to requirements.txt to support file uploads in the application.
- Ensured psycopg2-binary remains included for PostgreSQL database interactions, maintaining necessary dependencies for backend functionality.

* feat: enhance template management and application structure
- Added a new function to load templates from JSON files into the database, ensuring templates are created if they do not already exist.
- Updated the lifespan of the FastAPI application to include the loading of templates during startup.
- Refactored the TemplatePadRepository to update and delete templates using their name instead of ID, improving usability.
- Introduced a new router for template pad endpoints, expanding API capabilities for template management.
- Added a default template JSON file to provide a starting point for users, enhancing user experience.

* feat: refactor authentication and user session management
- Introduced UserSession class to unify user session handling, integrating authentication data with user information.
- Updated AuthDependency to utilize UserSession, enhancing session validation and user data retrieval.
- Replaced SessionData references with UserSession across various routers for consistent user session management.
- Added new methods to UserSession for accessing user attributes and caching user data from the database.
- Removed deprecated canvas router and introduced a new pad router for managing user pads and backups, improving API structure and functionality.

* refactor: streamline user session handling and update API endpoints
- Simplified UserSession initialization by decoding the JWT token directly, enhancing security and clarity.
- Removed unused to_dict method from UserSession to reduce code complexity.
- Updated AuthDependency to eliminate unnecessary user service dependency, streamlining session validation.
- Corrected API endpoint in hooks.ts from '/api/user/me' to '/api/users/me' for consistency with routing.
- Cleaned up imports in user_router.py to maintain code organization.

* refactor: reorganize workspace routing and implement user email uniqueness
- Updated import statement in main.py to reflect the new workspace_router file structure.
- Modified UserModel to enforce unique email addresses for users, improving data integrity.
- Introduced a new workspace_router.py file to manage workspace-related endpoints, replacing the deprecated workspace.py, streamlining the API structure.

* refactor: update pad router and API endpoint consistency
- Refactored pad router to streamline user pad management, including saving and retrieving canvas data.
- Updated user session handling to use user.id instead of user_id for consistency.
- Changed API endpoints in frontend hooks to align with new pad routing structure, enhancing clarity and usability.
- Removed deprecated default canvas data retrieval function, simplifying the codebase.

* refactor: update workspace state management in API and components
- Modified WorkspaceState interface to include 'name' and 'id' properties, enhancing workspace identification.
- Updated ActionButton and Terminal components to utilize 'name' instead of 'workspace_id' for URL generation, improving consistency across the application.
- Ensured all relevant references to workspace identification are aligned with the new state structure, streamlining the codebase.

* refactor: remove db.py and reorganize database initialization
- Deleted db.py to streamline database management and reduce redundancy.
- Updated main.py to directly import init_db from the database module, simplifying the database initialization process.
- Introduced auth_router.py to handle authentication routes, enhancing modularity and organization of the codebase.
- Ensured dotenv loading is handled in the database module for consistent environment variable access.

* refactor: integrate CoderAPI into authentication and workspace management
- Added a new dependency function to provide a CoderAPI instance for use in routers.
- Updated auth_router to utilize CoderAPI in the callback endpoint, enhancing authentication flow.
- Modified workspace_router to incorporate CoderAPI in workspace state retrieval and management functions, improving workspace operations.
- Streamlined dependency management across routers for better modularity and code organization.

* refactor: migrate CoderAPI configuration to centralized config module
- Replaced dotenv loading in CoderAPI with centralized configuration from config.py, enhancing consistency and maintainability.
- Updated CoderAPI initialization to use environment variables directly from the config module.
- Adjusted error messages to reflect the new configuration approach, improving clarity for required variables.
- Streamlined imports across various modules to utilize the new configuration structure, promoting better organization.

* refactor: update authentication URL configuration for Keycloak
- Replaced direct references to OIDC_CONFIG with environment variables for OIDC_SERVER_URL, OIDC_REALM, OIDC_CLIENT_ID, and OIDC_CLIENT_SECRET, enhancing configuration management.
- Improved code clarity and maintainability by centralizing authentication URL generation in the config module.

* refactor: enhance JWT token handling and session management
- Integrated PyJWKClient for secure JWT verification, improving token validation by using signing keys from JWKS.
- Updated UserSession initialization to decode tokens with verification, enhancing security and error handling.
- Added a caching mechanism for the JWKS client to optimize performance.
- Cleaned up token expiration checks to ensure accurate validation and error reporting.

* refactor: enhance Redis connection management and backup functionality
- Introduced a Redis connection pool to optimize Redis client management, improving performance and resource utilization.
- Updated session management functions to utilize the new Redis connection pool, ensuring efficient access to session data.
- Implemented a new method in BackupService to retrieve backups for a user's first pad using a join operation, addressing the N+1 query problem.
- Enhanced pad router to create backups conditionally based on time intervals, improving backup efficiency and management.
- Streamlined user router to utilize the new Redis client retrieval method, promoting better code organization and maintainability.

* refactor: standardize API endpoint paths and enhance datetime handling
- Updated API endpoint paths in pad, template pad, and user routers to remove trailing slashes for consistency.
- Enhanced datetime handling in BackupService to include timezone information, improving accuracy in backup creation timing.
- Adjusted frontend API calls to align with the updated endpoint structure, ensuring seamless integration across the application.

* refactor: update CORS middleware configuration
- Changed CORS middleware to allow all origins by updating the allow_origins parameter to ["*"], enhancing flexibility for cross-origin requests.

* feat: add user synchronization with authentication token data
- Implemented a new method in UserService to synchronize user data with information from the authentication token, creating or updating the user as necessary.
- Updated the user router to utilize this new synchronization method, enhancing user data management and ensuring consistency between the database and authentication token data.

* feat: implement database migration system with Alembic
- Added Alembic configuration and migration scripts to facilitate database schema changes.
- Implemented a run_migrations function to execute migrations during application startup.
- Created migration scripts to transfer data from the old public schema to the new pad_ws schema, ensuring data integrity and consistency.
- Updated requirements.txt to include Alembic as a dependency for migration management.
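
The commit message mentions creating backups "conditionally based on time intervals" with timezone-aware datetimes, but the pad router and BackupService code is not part of this excerpt. The following is only a minimal sketch of how such a check might look. The function name `should_create_backup` and its parameters are hypothetical; `MIN_INTERVAL_MINUTES = 5` mirrors the constant added to config.py in this commit.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

MIN_INTERVAL_MINUTES = 5  # mirrors the constant added to config.py in this commit


def _as_utc(value: datetime) -> datetime:
    """Treat naive datetimes as UTC so aware and naive values are never mixed."""
    return value if value.tzinfo is not None else value.replace(tzinfo=timezone.utc)


def should_create_backup(
    last_backup_at: Optional[datetime],
    now: Optional[datetime] = None,
    min_interval_minutes: int = MIN_INTERVAL_MINUTES,
) -> bool:
    """Hypothetical helper: True when no backup exists or the last one is old enough."""
    if last_backup_at is None:
        return True
    current = _as_utc(now) if now is not None else datetime.now(timezone.utc)
    return current - _as_utc(last_backup_at) >= timedelta(minutes=min_interval_minutes)
```

Normalising both timestamps to UTC before subtracting avoids the `TypeError` Python raises when an aware datetime is compared with a naive one, which is the kind of problem the "include timezone information" item above appears to address.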
1 parent e13c2c9 commit ddaa28a

39 files changed: +2504 -571 lines changed

src/backend/coder.py

Lines changed: 13 additions & 17 deletions
@@ -1,26 +1,22 @@
 import os
 import requests
-from dotenv import load_dotenv
+from config import CODER_API_KEY, CODER_URL, CODER_TEMPLATE_ID, CODER_DEFAULT_ORGANIZATION, CODER_WORKSPACE_NAME

 class CoderAPI:
     """
-    A class for interacting with the Coder API using credentials from .env file
+    A class for interacting with the Coder API using credentials from config
     """

     def __init__(self):
-        # Load environment variables from .env file
-        load_dotenv()
+        # Get configuration from config
+        self.api_key = CODER_API_KEY
+        self.coder_url = CODER_URL
+        self.template_id = CODER_TEMPLATE_ID
+        self.default_organization_id = CODER_DEFAULT_ORGANIZATION

-        # Get configuration from environment variables
-        self.api_key = os.getenv("CODER_API_KEY")
-        self.coder_url = os.getenv("CODER_URL")
-        self.user_id = os.getenv("USER_ID")
-        self.template_id = os.getenv("CODER_TEMPLATE_ID")
-        self.default_organization_id = os.getenv("CODER_DEFAULT_ORGANIZATION")
-
-        # Check if required environment variables are set
+        # Check if required configuration variables are set
         if not self.api_key or not self.coder_url:
-            raise ValueError("CODER_API_KEY and CODER_URL must be set in .env file")
+            raise ValueError("CODER_API_KEY and CODER_URL must be set in environment variables")

         # Set up common headers for API requests
         self.headers = {
@@ -56,9 +52,9 @@ def create_workspace(self, user_id, parameter_values=None):
         template_id = self.template_id

         if not template_id:
-            raise ValueError("template_id must be provided or TEMPLATE_ID must be set in .env")
+            raise ValueError("template_id must be provided or TEMPLATE_ID must be set in environment variables")

-        name = os.getenv("CODER_WORKSPACE_NAME", "ubuntu")
+        name = CODER_WORKSPACE_NAME

         # Prepare the request data
         data = {
@@ -201,7 +197,7 @@ def get_workspace_status_for_user(self, username):
         Returns:
             dict: Workspace status data if found, None otherwise
         """
-        workspace_name = os.getenv("CODER_WORKSPACE_NAME", "ubuntu")
+        workspace_name = CODER_WORKSPACE_NAME

         endpoint = f"{self.coder_url}/api/v2/users/{username}/workspace/{workspace_name}"
         response = requests.get(endpoint, headers=self.headers)
@@ -282,4 +278,4 @@ def stop_workspace(self, workspace_id):
         headers['Content-Type'] = 'application/json'
         response = requests.post(endpoint, headers=headers, json=data)
         response.raise_for_status()
-        return response.json()
+        return response.json()
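
Because the refactored `CoderAPI.__init__` reads module-level constants, the values are fixed when the module is imported. One possible variation, shown here only as a sketch and not as the commit's implementation, is to expose the configuration as constructor defaults so tests can inject values. The class name `CoderClientSketch` and the method `workspace_endpoint` are hypothetical, the constants are stand-ins for the ones imported from config.py, and the endpoint path is copied from `get_workspace_status_for_user` in the diff above.

```python
import os
from typing import Optional

# Stand-ins for the constants this commit imports from config.py.
CODER_API_KEY = os.getenv("CODER_API_KEY")
CODER_URL = os.getenv("CODER_URL")
CODER_WORKSPACE_NAME = os.getenv("CODER_WORKSPACE_NAME", "ubuntu")


class CoderClientSketch:
    """Hypothetical variant of CoderAPI whose constructor accepts overrides."""

    def __init__(
        self,
        api_key: Optional[str] = CODER_API_KEY,
        coder_url: Optional[str] = CODER_URL,
    ) -> None:
        # Report exactly which required settings are missing.
        required = (("CODER_API_KEY", api_key), ("CODER_URL", coder_url))
        missing = [name for name, value in required if not value]
        if missing:
            raise ValueError(f"{', '.join(missing)} must be set in environment variables")
        self.api_key = api_key
        self.coder_url = coder_url.rstrip("/")

    def workspace_endpoint(
        self, username: str, workspace_name: str = CODER_WORKSPACE_NAME
    ) -> str:
        """Build the workspace status URL used in the diff above."""
        return f"{self.coder_url}/api/v2/users/{username}/workspace/{workspace_name}"
```

A test could then construct `CoderClientSketch(api_key="k", coder_url="https://coder.example.com")` without touching the process environment.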

src/backend/config.py

Lines changed: 87 additions & 41 deletions
@@ -3,94 +3,132 @@
 import time
 import httpx
 import redis
+from redis import ConnectionPool, Redis
 import jwt
+from jwt.jwks_client import PyJWKClient
 from typing import Optional, Dict, Any, Tuple
 from dotenv import load_dotenv

+# Load environment variables once
 load_dotenv()

+# ===== Application Configuration =====
 STATIC_DIR = os.getenv("STATIC_DIR")
 ASSETS_DIR = os.getenv("ASSETS_DIR")
+FRONTEND_URL = os.getenv('FRONTEND_URL')

-OIDC_CONFIG = {
-    'client_id': os.getenv('OIDC_CLIENT_ID'),
-    'client_secret': os.getenv('OIDC_CLIENT_SECRET'),
-    'server_url': os.getenv('OIDC_SERVER_URL'),
-    'realm': os.getenv('OIDC_REALM'),
-    'redirect_uri': os.getenv('REDIRECT_URI'),
-    'frontend_url': os.getenv('FRONTEND_URL')
-}
-
-# Redis connection
-redis_client = redis.Redis(
-    host=os.getenv('REDIS_HOST', 'localhost'),
-    password=os.getenv('REDIS_PASSWORD', None),
-    port=int(os.getenv('REDIS_PORT', 6379)),
+MAX_BACKUPS_PER_USER = 10  # Maximum number of backups to keep per user
+MIN_INTERVAL_MINUTES = 5  # Minimum interval in minutes between backups
+DEFAULT_PAD_NAME = "Untitled"  # Default name for new pads
+DEFAULT_TEMPLATE_NAME = "default"  # Template name to use when a user doesn't have a pad
+
+# ===== PostHog Configuration =====
+POSTHOG_API_KEY = os.getenv("VITE_PUBLIC_POSTHOG_KEY")
+POSTHOG_HOST = os.getenv("VITE_PUBLIC_POSTHOG_HOST")
+
+# ===== OIDC Configuration =====
+OIDC_CLIENT_ID = os.getenv('OIDC_CLIENT_ID')
+OIDC_CLIENT_SECRET = os.getenv('OIDC_CLIENT_SECRET')
+OIDC_SERVER_URL = os.getenv('OIDC_SERVER_URL')
+OIDC_REALM = os.getenv('OIDC_REALM')
+OIDC_REDIRECT_URI = os.getenv('REDIRECT_URI')
+
+# ===== Redis Configuration =====
+REDIS_HOST = os.getenv('REDIS_HOST', 'localhost')
+REDIS_PASSWORD = os.getenv('REDIS_PASSWORD', None)
+REDIS_PORT = int(os.getenv('REDIS_PORT', 6379))
+
+# Create a Redis connection pool
+redis_pool = ConnectionPool(
+    host=REDIS_HOST,
+    password=REDIS_PASSWORD,
+    port=REDIS_PORT,
     db=0,
-    decode_responses=True
+    decode_responses=True,
+    max_connections=10,  # Adjust based on your application's needs
+    socket_timeout=5.0,
+    socket_connect_timeout=1.0,
+    health_check_interval=30
 )

+# Create a Redis client that uses the connection pool
+redis_client = Redis(connection_pool=redis_pool)
+
+def get_redis_client():
+    """Get a Redis client from the connection pool"""
+    return Redis(connection_pool=redis_pool)
+
+# ===== Coder API Configuration =====
+CODER_API_KEY = os.getenv("CODER_API_KEY")
+CODER_URL = os.getenv("CODER_URL")
+CODER_TEMPLATE_ID = os.getenv("CODER_TEMPLATE_ID")
+CODER_DEFAULT_ORGANIZATION = os.getenv("CODER_DEFAULT_ORGANIZATION")
+CODER_WORKSPACE_NAME = os.getenv("CODER_WORKSPACE_NAME", "ubuntu")
+
+# Cache for JWKS client
+_jwks_client = None
+
 # Session management functions
 def get_session(session_id: str) -> Optional[Dict[str, Any]]:
     """Get session data from Redis"""
-    session_data = redis_client.get(f"session:{session_id}")
+    client = get_redis_client()
+    session_data = client.get(f"session:{session_id}")
     if session_data:
         return json.loads(session_data)
     return None

 def set_session(session_id: str, data: Dict[str, Any], expiry: int) -> None:
     """Store session data in Redis with expiry in seconds"""
-    redis_client.setex(
+    client = get_redis_client()
+    client.setex(
         f"session:{session_id}",
         expiry,
         json.dumps(data)
     )

 def delete_session(session_id: str) -> None:
     """Delete session data from Redis"""
-    redis_client.delete(f"session:{session_id}")
-
-provisioning_times = {}
+    client = get_redis_client()
+    client.delete(f"session:{session_id}")

 def get_auth_url() -> str:
     """Generate the authentication URL for Keycloak login"""
-    auth_url = f"{OIDC_CONFIG['server_url']}/realms/{OIDC_CONFIG['realm']}/protocol/openid-connect/auth"
+    auth_url = f"{OIDC_SERVER_URL}/realms/{OIDC_REALM}/protocol/openid-connect/auth"
     params = {
-        'client_id': OIDC_CONFIG['client_id'],
+        'client_id': OIDC_CLIENT_ID,
         'response_type': 'code',
-        'redirect_uri': OIDC_CONFIG['redirect_uri'],
+        'redirect_uri': OIDC_REDIRECT_URI,
         'scope': 'openid profile email'
     }
     return f"{auth_url}?{'&'.join(f'{k}={v}' for k,v in params.items())}"

 def get_token_url() -> str:
     """Get the token endpoint URL"""
-    return f"{OIDC_CONFIG['server_url']}/realms/{OIDC_CONFIG['realm']}/protocol/openid-connect/token"
+    return f"{OIDC_SERVER_URL}/realms/{OIDC_REALM}/protocol/openid-connect/token"

 def is_token_expired(token_data: Dict[str, Any], buffer_seconds: int = 30) -> bool:
-    """
-    Check if the access token is expired or about to expire
-
-    Args:
-        token_data: The token data containing the access token
-        buffer_seconds: Buffer time in seconds to refresh token before it actually expires
-
-    Returns:
-        bool: True if token is expired or about to expire, False otherwise
-    """
     if not token_data or 'access_token' not in token_data:
         return True

     try:
-        # Decode the JWT token without verification to get expiration time
-        decoded = jwt.decode(token_data['access_token'], options={"verify_signature": False})
+        # Get the signing key
+        jwks_client = get_jwks_client()
+        signing_key = jwks_client.get_signing_key_from_jwt(token_data['access_token'])

-        # Get expiration time from token
-        exp_time = decoded.get('exp', 0)
+        # Decode with verification
+        decoded = jwt.decode(
+            token_data['access_token'],
+            signing_key.key,
+            algorithms=["RS256"],  # Common algorithm for OIDC
+            audience=OIDC_CLIENT_ID,
+        )

-        # Check if token is expired or about to expire (with buffer)
+        # Check expiration
+        exp_time = decoded.get('exp', 0)
         current_time = time.time()
         return current_time + buffer_seconds >= exp_time
+    except jwt.ExpiredSignatureError:
+        return True
     except Exception as e:
         print(f"Error checking token expiration: {str(e)}")
         return True
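
The buffer comparison in `is_token_expired` can be separated from JWT verification so it is easy to unit test. The sketch below assumes the claims have already been decoded and verified; the function name `expires_within` and the injectable `now` parameter are hypothetical and not part of the commit.

```python
import time
from typing import Any, Dict, Optional


def expires_within(
    claims: Dict[str, Any],
    buffer_seconds: int = 30,
    now: Optional[float] = None,
) -> bool:
    """Hypothetical helper: True if the token is expired or expires within the buffer."""
    # A missing 'exp' claim defaults to 0, which is always treated as expired.
    exp_time = claims.get("exp", 0)
    current_time = time.time() if now is None else now
    return current_time + buffer_seconds >= exp_time
```

With `exp` set to 1000 and the default 30-second buffer, a token is reported as expiring from `now=970` onward, which gives the caller time to refresh it before requests start failing.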
@@ -115,8 +153,8 @@ async def refresh_token(session_id: str, token_data: Dict[str, Any]) -> Tuple[bo
                 get_token_url(),
                 data={
                     'grant_type': 'refresh_token',
-                    'client_id': OIDC_CONFIG['client_id'],
-                    'client_secret': OIDC_CONFIG['client_secret'],
+                    'client_id': OIDC_CLIENT_ID,
+                    'client_secret': OIDC_CLIENT_SECRET,
                     'refresh_token': token_data['refresh_token']
                 }
             )
@@ -136,3 +174,11 @@ async def refresh_token(session_id: str, token_data: Dict[str, Any]) -> Tuple[bo
     except Exception as e:
         print(f"Error refreshing token: {str(e)}")
         return False, token_data
+
+def get_jwks_client():
+    """Get or create a PyJWKClient for token verification"""
+    global _jwks_client
+    if _jwks_client is None:
+        jwks_url = f"{OIDC_SERVER_URL}/realms/{OIDC_REALM}/protocol/openid-connect/certs"
+        _jwks_client = PyJWKClient(jwks_url)
+    return _jwks_client
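
`get_auth_url` in this file joins the query parameters with plain string formatting, so the spaces in `scope` and the `://` in the redirect URI are sent without percent-encoding. A possible alternative, sketched here with the standard library's `urllib.parse.urlencode`, is shown below. The function name `build_auth_url` and its explicit parameters are hypothetical; the endpoint path and parameter names are taken from the diff above.

```python
from urllib.parse import urlencode


def build_auth_url(
    server_url: str,
    realm: str,
    client_id: str,
    redirect_uri: str,
    scope: str = "openid profile email",
) -> str:
    """Hypothetical helper: build a Keycloak login URL with an encoded query string."""
    base = f"{server_url.rstrip('/')}/realms/{realm}/protocol/openid-connect/auth"
    params = {
        "client_id": client_id,
        "response_type": "code",
        "redirect_uri": redirect_uri,
        "scope": scope,
    }
    # urlencode percent-encodes reserved characters and turns spaces into '+'.
    return f"{base}?{urlencode(params)}"
```

Encoding matters most when the redirect URI itself carries a query string, since an unencoded `&` or `=` inside it would otherwise be parsed as part of the outer URL.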

src/backend/database/__init__.py

Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
+"""
+Database module for the application.
+
+This module provides access to all database components used in the application.
+"""
+
+from .database import (
+    init_db,
+    get_session,
+    get_user_repository,
+    get_pad_repository,
+    get_backup_repository,
+    get_template_pad_repository,
+    get_user_service,
+    get_pad_service,
+    get_backup_service,
+    get_template_pad_service
+)
+
+__all__ = [
+    'init_db',
+    'get_session',
+    'get_user_repository',
+    'get_pad_repository',
+    'get_backup_repository',
+    'get_template_pad_repository',
+    'get_user_service',
+    'get_pad_service',
+    'get_backup_service',
+    'get_template_pad_service',
+]
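
The exported names suggest a chain of providers: a session feeds a repository, which feeds a service. The file database.py is not part of this excerpt, so the following is only an illustrative sketch of that layering using plain Python stand-ins. `FakeSession`, the dict-based storage, the method names, and every signature are assumptions; the real module uses SQLAlchemy async sessions according to the commit message.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import AsyncIterator, Dict, Optional


class FakeSession:
    """Stand-in for an async database session; rows live in a dict."""

    def __init__(self) -> None:
        self.rows: Dict[str, dict] = {}
        self.closed = False

    async def close(self) -> None:
        self.closed = True


class UserRepository:
    """Data access only: reads and writes rows through the session."""

    def __init__(self, session: FakeSession) -> None:
        self.session = session

    async def create(self, user_id: str, username: str) -> dict:
        row = {"id": user_id, "username": username}
        self.session.rows[user_id] = row
        return row

    async def get_by_id(self, user_id: str) -> Optional[dict]:
        return self.session.rows.get(user_id)


class UserService:
    """Business rules sit here and delegate storage to the repository."""

    def __init__(self, repository: UserRepository) -> None:
        self.repository = repository

    async def create_user(self, user_id: str, username: str) -> dict:
        if await self.repository.get_by_id(user_id) is not None:
            raise ValueError(f"user {user_id} already exists")
        return await self.repository.create(user_id, username)


@asynccontextmanager
async def get_session() -> AsyncIterator[FakeSession]:
    """Yield a session and always close it afterwards."""
    session = FakeSession()
    try:
        yield session
    finally:
        await session.close()


def get_user_repository(session: FakeSession) -> UserRepository:
    return UserRepository(session)


def get_user_service(session: FakeSession) -> UserService:
    return UserService(get_user_repository(session))


async def demo() -> dict:
    async with get_session() as session:
        service = get_user_service(session)
        return await service.create_user("u-1", "alice")
```

Running `asyncio.run(demo())` creates one user through the service layer and closes the session when the `async with` block exits.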
