Add InfluxDB integration for enhanced historical data retrieval #586
Conversation
This feature enables EMHASS to use InfluxDB as an alternative data source to the Home Assistant API, providing access to longer historical data retention for better machine learning model training.

## New Features:
- **InfluxDB data retrieval**: Complete integration with InfluxDB v1.x
- **Automatic data source selection**: Simple config switch between the HA API and InfluxDB
- **Enhanced ML model training**: Access to months/years of historical data vs. HA's limited retention
- **Improved performance**: Faster queries for large datasets

## Configuration:
- Added 8 new configuration parameters for the InfluxDB connection
- Backward compatible: InfluxDB disabled by default
- Simple activation: `"use_influxdb": true` in the configuration

## Implementation:
- New `get_data_influxdb()` method in the `RetrieveHass` class
- Transparent integration: existing code unchanged
- Robust error handling and connection management
- Data format compatibility with existing HA API data

## Testing:
- Tested with the default InfluxDB 1.x Home Assistant add-on
- Near-perfect correlation (0.999985) between InfluxDB and HA API data
- Successfully integrates with existing EMHASS workflows

**Looking for testers and feedback!** Please test with your InfluxDB setup and report any issues or suggestions for improvement.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
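For orientation, a minimal sketch of what the configuration switch might look like. The parameter names are taken from the class diagram in the reviewer's guide below; the values are placeholders, and the exact key names, defaults, and placement in config.json should be checked against the PR's config schema:

```json
{
  "use_influxdb": true,
  "influxdb_host": "localhost",
  "influxdb_port": 8086,
  "influxdb_username": "emhass",
  "influxdb_password": "secret",
  "influxdb_database": "homeassistant",
  "influxdb_measurement": "W",
  "influxdb_retention_policy": "autogen"
}
```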
## Reviewer's Guide

Introduce optional InfluxDB integration in RetrieveHass via a new configuration switch, transparently falling back to the Home Assistant API and providing robust connection management, query execution, and DataFrame assembly for extended historical data retrieval.

### Sequence diagram for data retrieval with InfluxDB integration

sequenceDiagram
participant "RetrieveHass"
participant "InfluxDBClient"
participant "Logger"
actor "User"
"User"->>"RetrieveHass": get_data(days_list, var_list)
alt use_influxdb is True
"RetrieveHass"->>"Logger": info("Retrieve InfluxDB get data method initiated...")
"RetrieveHass"->>"InfluxDBClient": connect(host, port, ...)
"InfluxDBClient"->>"RetrieveHass": ping()
loop for each sensor in var_list
"RetrieveHass"->>"InfluxDBClient": query(query)
"InfluxDBClient"-->>"RetrieveHass": result
"RetrieveHass"->>"Logger": info("Retrieved N data points for sensor")
end
"RetrieveHass"->>"InfluxDBClient": close()
"RetrieveHass"->>"Logger": info("InfluxDB data retrieval completed")
else use_influxdb is False
"RetrieveHass"->>"Logger": debug("InfluxDB integration disabled, using Home Assistant API")
"RetrieveHass"->>"Home Assistant API": get data
end
### Class diagram for RetrieveHass with InfluxDB integration

classDiagram
class RetrieveHass {
- use_influxdb: bool
- influxdb_host: str
- influxdb_port: int
- influxdb_username: str
- influxdb_password: str
- influxdb_database: str
- influxdb_measurement: str
- influxdb_retention_policy: str
+ get_data(days_list, var_list): bool
+ get_data_influxdb(days_list, var_list): bool
+ get_ha_config()
+ prepare_data(var_load, ...)
}
RetrieveHass ..> InfluxDBClient : uses
class InfluxDBClient {
+ ping()
+ query(query)
+ close()
}
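The diagrams above imply a simple data-source dispatch inside the existing retrieval path. Below is a minimal sketch of that switch; the class and helper names here are illustrative, only `get_data_influxdb` and `use_influxdb` come from the PR, and the exact call site may differ:

```python
# Illustrative sketch of the dispatch implied by the sequence diagram; not the actual EMHASS code.
class RetrieveHassSketch:
    def __init__(self, use_influxdb: bool, logger):
        self.use_influxdb = use_influxdb
        self.logger = logger

    def get_data(self, days_list, var_list) -> bool:
        if self.use_influxdb:
            # Optional InfluxDB path added by this PR
            return self.get_data_influxdb(days_list, var_list)
        self.logger.debug("InfluxDB integration disabled, using Home Assistant API")
        return self.get_data_from_ha_api(days_list, var_list)

    def get_data_influxdb(self, days_list, var_list) -> bool:
        ...  # see the method under review in the comments below

    def get_data_from_ha_api(self, days_list, var_list) -> bool:
        ...  # existing Home Assistant API retrieval
```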
Hey there - I've reviewed your changes and they look great!
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location> `src/emhass/retrieve_hass.py:369-371` </location>
<code_context>
+ return False
+
+ # Calculate total time range
+ start_time = days_list[0]
+ end_time = days_list[-1] + pd.Timedelta(days=1)
+ total_days = (end_time - start_time).days
+
+ self.logger.info(f"Retrieving {len(var_list)} sensors over {total_days} days from InfluxDB")
</code_context>
<issue_to_address>
**issue:** Time range calculation may not handle empty days_list gracefully.
Accessing days_list[0] and days_list[-1] will cause an IndexError if days_list is empty. Please add a check to handle this case.
</issue_to_address>
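A minimal sketch of such a guard, slotted in just before the time-range calculation (the log message wording is an assumption; the early `return False` mirrors the method's existing error handling):

```python
# Sketch: bail out early if days_list is empty, before indexing it
if len(days_list) == 0:
    self.logger.error("Empty days_list passed to get_data_influxdb, cannot build time range")
    return False
start_time = days_list[0]
end_time = days_list[-1] + pd.Timedelta(days=1)
```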
### Comment 2
<location> `src/emhass/retrieve_hass.py:392-399` </location>
<code_context>
+ start_time_str = start_time.strftime('%Y-%m-%dT%H:%M:%SZ')
+ end_time_str = end_time.strftime('%Y-%m-%dT%H:%M:%SZ')
+
+ query = f'''
+ SELECT mean("value") AS "mean_value"
+ FROM "{self.influxdb_database}"."{self.influxdb_retention_policy}"."{self.influxdb_measurement}"
+ WHERE time >= '{start_time_str}'
+ AND time < '{end_time_str}'
+ AND "entity_id"='{entity_id}'
+ GROUP BY time({interval}) FILL(linear)
+ '''
+
+ self.logger.debug(f"InfluxDB query: {query}")
</code_context>
<issue_to_address>
**issue (bug_risk):** InfluxQL query uses FILL(linear), which may not be supported in all InfluxDB versions.
FILL(linear) is only available in InfluxDB Enterprise. If targeting open-source InfluxDB, this query will fail. Please make the fill method configurable or provide a fallback to FILL(previous) or FILL(none) when linear is unsupported.
</issue_to_address>
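One possible shape for this, using a hypothetical `influxdb_fill` option (the option name is not part of the original diff; `FILL(previous)` and `FILL(none)` are safe choices on open-source InfluxDB 1.x):

```python
# Sketch: make the fill mode configurable instead of hardcoding FILL(linear)
fill_mode = getattr(self, "influxdb_fill", "previous")  # e.g. "previous", "none", "linear"
query = f'''
    SELECT mean("value") AS "mean_value"
    FROM "{self.influxdb_database}"."{self.influxdb_retention_policy}"."{self.influxdb_measurement}"
    WHERE time >= '{start_time_str}'
    AND time < '{end_time_str}'
    AND "entity_id"='{entity_id}'
    GROUP BY time({interval}) FILL({fill_mode})
'''
```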
### Comment 3
<location> `src/emhass/retrieve_hass.py:418-420` </location>
<code_context>
+ df_sensor = pd.DataFrame(points)
+
+ # Convert time column and set as index
+ df_sensor['time'] = pd.to_datetime(df_sensor['time'])
+ df_sensor.set_index('time', inplace=True)
+
+ # Rename value column to original sensor name
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Time conversion does not specify timezone awareness.
If timestamps are not in UTC or lack timezone info, this may cause alignment issues. Please ensure timezone is set or validated.
```suggestion
# Convert time column and set as index, ensuring timezone awareness (UTC)
df_sensor['time'] = pd.to_datetime(df_sensor['time'], utc=True)
df_sensor.set_index('time', inplace=True)
```
</issue_to_address>
### Comment 4
<location> `src/emhass/retrieve_hass.py:429-430` </location>
<code_context>
+ continue
+
+ # Handle non-numeric data (same as HA processing)
+ df_sensor[sensor] = pd.to_numeric(df_sensor[sensor], errors='coerce')
+
+ # Create time index for first sensor
</code_context>
<issue_to_address>
**suggestion:** Non-numeric values are coerced to NaN without further handling.
If many values are non-numeric, consider logging a warning or adding explicit handling for cases with a high proportion of NaNs.
```suggestion
# Handle non-numeric data (same as HA processing)
df_sensor[sensor] = pd.to_numeric(df_sensor[sensor], errors='coerce')
# Check proportion of NaNs and log warning if high
nan_count = df_sensor[sensor].isna().sum()
total_count = len(df_sensor[sensor])
if total_count > 0:
nan_ratio = nan_count / total_count
if nan_ratio > 0.2:
self.logger.warning(
f"Sensor '{sensor}' has {nan_count}/{total_count} ({nan_ratio:.1%}) non-numeric values coerced to NaN."
)
```
</issue_to_address>
### Comment 5
<location> `src/emhass/retrieve_hass.py:435-441` </location>
<code_context>
+ # Create time index for first sensor
+ if i == 0:
+     # Create complete time range with specified frequency
+     ts = pd.date_range(
+         start=df_sensor.index.min(),
+         end=df_sensor.index.max(),
+         freq=self.freq
+     )
+     df_complete = pd.DataFrame(index=ts)
+     df_complete = pd.concat([df_complete, df_sensor], axis=1)
+     self.df_final = df_complete
+ else:
</code_context>
<issue_to_address>
**suggestion:** Time range for the first sensor is based on its data only.
Using only the first sensor's time range may exclude data from other sensors. Please align the time index to cover the full range across all sensors.
Suggested implementation:
```python
# Collect min and max timestamps for each sensor
if i == 0:
global_min = df_sensor.index.min()
global_max = df_sensor.index.max()
else:
global_min = min(global_min, df_sensor.index.min())
global_max = max(global_max, df_sensor.index.max())
```
```python
# After processing all sensors, create the complete time index
ts = pd.date_range(
start=global_min,
end=global_max,
freq=self.freq
)
df_complete = pd.DataFrame(index=ts)
# You may need to concatenate all sensor dataframes here as needed
self.df_final = df_complete
self.logger.debug("InfluxDB integration disabled, using Home Assistant API")
```
You will need to:
1. Move the creation of the complete time index and DataFrame (`df_complete`) outside the sensor loop, after all sensors have been processed.
2. Concatenate all sensor dataframes to `df_complete` as needed, rather than just the first sensor.
3. Ensure that `global_min` and `global_max` are initialized before the loop and updated for each sensor.
</issue_to_address>
### Comment 6
<location> `src/emhass/retrieve_hass.py:457-462` </location>
<code_context>
+ return False
+
+ # Set frequency and validate
+ self.df_final = set_df_index_freq(self.df_final)
+ if self.df_final.index.freq != self.freq:
+ self.logger.warning(
</code_context>
<issue_to_address>
**suggestion (bug_risk):** set_df_index_freq is called without error handling.
Wrap set_df_index_freq in a try/except block and log exceptions to improve error handling.
```suggestion
# Set frequency and validate
try:
self.df_final = set_df_index_freq(self.df_final)
except Exception as e:
self.logger.error(f"Exception occurred while setting DataFrame index frequency: {e}")
return False
if self.df_final.index.freq != self.freq:
self.logger.warning(
f"InfluxDB data frequency ({self.df_final.index.freq}) differs from expected ({self.freq})"
)
```
</issue_to_address>
### Comment 7
<location> `src/emhass/retrieve_hass.py:323` </location>
<code_context>
self.var_list = var_list
return True
+ def get_data_influxdb(
+ self,
+ days_list: pd.date_range,
</code_context>
<issue_to_address>
**issue (complexity):** Consider refactoring get_data_influxdb into a short orchestration loop with focused helper methods for connection, query building, data fetching, and merging.
Here’s one way to collapse the 160-line `get_data_influxdb` into a thin loop + a few focused helpers. Everything stays exactly the same under the hood, but each block is now testable and reads in 3–5 lines:
```python
def get_data_influxdb(self, days_list, var_list) -> bool:
client = self._init_influx_client()
if not client:
return False
start, end = days_list[0], days_list[-1] + pd.Timedelta(days=1)
dfs = []
for sensor in filter(None, var_list):
df = self._fetch_sensor_df(client, sensor, start, end)
if df is not None:
dfs.append(df)
client.close()
if not dfs:
self.logger.error("No data retrieved from InfluxDB")
return False
self.df_final = self._merge_dfs(dfs)
self.var_list = var_list
self.logger.info(f"InfluxDB data retrieval completed: {self.df_final.shape}")
return True
```
Below are the helpers—you can fold these into your class:
```python
def _init_influx_client(self):
try:
from influxdb import InfluxDBClient
client = InfluxDBClient(
host=self.influxdb_host,
port=self.influxdb_port,
username=self.influxdb_username or None,
password=self.influxdb_password or None,
database=self.influxdb_database,
)
client.ping()
self.logger.debug(f"Connected to InfluxDB @ {self.influxdb_host}:{self.influxdb_port}")
return client
except ImportError:
self.logger.error("pip install influxdb")
except Exception as e:
self.logger.error(f"InfluxDB connection failed: {e}")
return None
```
```python
def _build_query(self, sensor, start, end):
entity = sensor.removeprefix("sensor.")
interval = f"{int(self.freq.total_seconds()/60)}m"
st, et = start.strftime("%Y-%m-%dT%H:%M:%SZ"), end.strftime("%Y-%m-%dT%H:%M:%SZ")
return f'''
SELECT mean("value") AS "mean_value"
FROM "{self.influxdb_database}"."{self.influxdb_retention_policy}"."{self.influxdb_measurement}"
WHERE time >= '{st}' AND time < '{et}' AND "entity_id"='{entity}'
GROUP BY time({interval}) FILL(linear)
'''
```
```python
def _fetch_sensor_df(self, client, sensor, start, end):
try:
query = self._build_query(sensor, start, end)
points = list(client.query(query).get_points())
if not points:
self.logger.warning(f"No data for {sensor}")
return None
df = pd.DataFrame(points)
df['time'] = pd.to_datetime(df['time'])
df.set_index('time', inplace=True)
df = df.rename(columns={'mean_value': sensor})
df[sensor] = pd.to_numeric(df[sensor], errors='coerce')
# ensure a full index on the first sensor
if not hasattr(self, "_base_index"):
idx = pd.date_range(df.index.min(), df.index.max(), freq=self.freq)
self._base_index = pd.DataFrame(index=idx)
return pd.concat([self._base_index, df], axis=1)
except Exception as e:
self.logger.error(f"Query failed for {sensor}: {e}")
return None
```
```python
def _merge_dfs(self, dfs):
df = pd.concat(dfs, axis=1)
df = set_df_index_freq(df)
if df.index.freq != self.freq:
self.logger.warning(
f"Freq mismatch: got {df.index.freq}, expected {self.freq}"
)
return df
```
Benefits:
- `get_data_influxdb` is now a 10-line orchestration loop
- Each helper does one job: connect, build query, fetch/normalize, merge
- Easier to test/fix individual pieces without wading through the whole method
</issue_to_address>
### Comment 8
<location> `src/emhass/retrieve_hass.py:357` </location>
<code_context>
username=self.influxdb_username if self.influxdb_username else None,
</code_context>
<issue_to_address>
**suggestion (code-quality):** Replace if-expression with `or` ([`or-if-exp-identity`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/Python/Default-Rules/or-if-exp-identity))
```suggestion
username=self.influxdb_username or None,
```
<br/><details><summary>Explanation</summary>Here we find ourselves setting a value if it evaluates to `True`, and otherwise
using a default. The `or` form is a bit easier to read and avoids repeating the same expression.
It works because the left-hand side is evaluated first: if it is truthy, that value is used and the
right-hand side is never evaluated; if it is falsy, the right-hand side (the default, here `None`) is used instead.
</issue_to_address>
### Comment 9
<location> `src/emhass/retrieve_hass.py:358` </location>
<code_context>
password=self.influxdb_password if self.influxdb_password else None,
</code_context>
<issue_to_address>
**suggestion (code-quality):** Replace if-expression with `or` ([`or-if-exp-identity`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/Python/Default-Rules/or-if-exp-identity))
```suggestion
password=self.influxdb_password or None,
```
</issue_to_address>
### Comment 10
<location> `src/emhass/retrieve_hass.py:323` </location>
<code_context>
def get_data_influxdb(
    self,
    days_list: pd.date_range,
    var_list: list,
) -> bool:
    """
    Retrieve data from InfluxDB database.
    This method provides an alternative data source to Home Assistant API,
    enabling longer historical data retention for better machine learning model training.
    :param days_list: A list of days to retrieve data for
    :type days_list: pandas.date_range
    :param var_list: List of sensor entity IDs to retrieve
    :type var_list: list
    :return: Success status of data retrieval
    :rtype: bool
    """
    self.logger.info("Retrieve InfluxDB get data method initiated...")
    try:
        from influxdb import InfluxDBClient
    except ImportError:
        self.logger.error("InfluxDB client not installed. Install with: pip install influxdb")
        return False
    # Remove empty strings from var_list
    var_list = [var for var in var_list if var != ""]
    # Connect to InfluxDB
    try:
        client = InfluxDBClient(
            host=self.influxdb_host,
            port=self.influxdb_port,
            username=self.influxdb_username if self.influxdb_username else None,
            password=self.influxdb_password if self.influxdb_password else None,
            database=self.influxdb_database
        )
        # Test connection
        client.ping()
        self.logger.debug(f"Successfully connected to InfluxDB at {self.influxdb_host}:{self.influxdb_port}")
    except Exception as e:
        self.logger.error(f"Failed to connect to InfluxDB: {e}")
        return False
    # Calculate total time range
    start_time = days_list[0]
    end_time = days_list[-1] + pd.Timedelta(days=1)
    total_days = (end_time - start_time).days
    self.logger.info(f"Retrieving {len(var_list)} sensors over {total_days} days from InfluxDB")
    self.logger.debug(f"Time range: {start_time} to {end_time}")
    self.df_final = pd.DataFrame()
    for i, sensor in enumerate(var_list):
        self.logger.debug(f"Retrieving sensor {i+1}/{len(var_list)}: {sensor}")
        # Convert sensor name: sensor.sec_pac_solar -> sec_pac_solar
        entity_id = sensor.replace('sensor.', '') if sensor.startswith('sensor.') else sensor
        # Convert frequency to InfluxDB interval
        freq_minutes = int(self.freq.total_seconds() / 60)
        interval = f"{freq_minutes}m"
        # Build InfluxQL query - format times properly for InfluxDB
        start_time_str = start_time.strftime('%Y-%m-%dT%H:%M:%SZ')
        end_time_str = end_time.strftime('%Y-%m-%dT%H:%M:%SZ')
        query = f'''
            SELECT mean("value") AS "mean_value"
            FROM "{self.influxdb_database}"."{self.influxdb_retention_policy}"."{self.influxdb_measurement}"
            WHERE time >= '{start_time_str}'
            AND time < '{end_time_str}'
            AND "entity_id"='{entity_id}'
            GROUP BY time({interval}) FILL(linear)
        '''
        self.logger.debug(f"InfluxDB query: {query}")
        try:
            # Execute query
            result = client.query(query)
            # Convert result to points
            points = list(result.get_points())
            if not points:
                self.logger.warning(f"No data found for sensor: {sensor}")
                continue
            self.logger.info(f"Retrieved {len(points)} data points for {sensor}")
            # Create DataFrame from points
            df_sensor = pd.DataFrame(points)
            # Convert time column and set as index
            df_sensor['time'] = pd.to_datetime(df_sensor['time'])
            df_sensor.set_index('time', inplace=True)
            # Rename value column to original sensor name
            if 'mean_value' in df_sensor.columns:
                df_sensor = df_sensor[['mean_value']].rename(columns={'mean_value': sensor})
            else:
                self.logger.error(f"Expected 'mean_value' column not found for {sensor}")
                continue
            # Handle non-numeric data (same as HA processing)
            df_sensor[sensor] = pd.to_numeric(df_sensor[sensor], errors='coerce')
            # Create time index for first sensor
            if i == 0:
                # Create complete time range with specified frequency
                ts = pd.date_range(
                    start=df_sensor.index.min(),
                    end=df_sensor.index.max(),
                    freq=self.freq
                )
                df_complete = pd.DataFrame(index=ts)
                df_complete = pd.concat([df_complete, df_sensor], axis=1)
                self.df_final = df_complete
            else:
                # Add to existing dataframe
                self.df_final = pd.concat([self.df_final, df_sensor], axis=1)
        except Exception as e:
            self.logger.error(f"Failed to query sensor {sensor}: {e}")
            continue
    client.close()
    if self.df_final.empty:
        self.logger.error("No data retrieved from InfluxDB")
        return False
    # Set frequency and validate
    self.df_final = set_df_index_freq(self.df_final)
    if self.df_final.index.freq != self.freq:
        self.logger.warning(
            f"InfluxDB data frequency ({self.df_final.index.freq}) differs from expected ({self.freq})"
        )
    self.var_list = var_list
    self.logger.info(f"InfluxDB data retrieval completed: {self.df_final.shape}")
    return True
</code_context>
<issue_to_address>
**issue (code-quality):** Low code quality found in RetrieveHass.get\_data\_influxdb - 19% ([`low-code-quality`](https://docs.sourcery.ai/Reference/Default-Rules/comments/low-code-quality/))
<br/><details><summary>Explanation</summary>The quality score for this function is below the quality threshold of 25%.
This score is a combination of the method length, cognitive complexity and working memory.
How can you solve this?
It might be worth refactoring this function to make it shorter and more readable.
- Reduce the function length by extracting pieces of functionality out into
their own functions. This is the most important thing you can do - ideally a
function should be less than 10 lines.
- Reduce nesting, perhaps by introducing guard clauses to return early.
- Ensure that variables are tightly scoped, so that code using related concepts
sits together within the function rather than being scattered.</details>
</issue_to_address>
This is a great feature.
- Fix empty days_list handling to prevent IndexError
- Change FILL(linear) to FILL(previous) for open-source InfluxDB compatibility
- Add timezone awareness with utc=True parameter to pd.to_datetime()
- Add NaN ratio warnings when >20% of sensor data is non-numeric
- Fix time index to span globally across all sensors instead of just the first sensor
- Add comprehensive error handling for the set_df_index_freq() method
- Refactor the large get_data_influxdb() method into focused helper functions:
  * _init_influx_client() for connection setup
  * _build_influx_query() for InfluxQL query construction
  * _fetch_sensor_data() for individual sensor data retrieval
- Replace if-expressions with cleaner 'or' operators for default values

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
## Why Auto-Discovery is Essential

This commit adds intelligent auto-discovery that automatically detects which InfluxDB measurement contains any given Home Assistant entity, regardless of its unit of measurement (W, EUR/kWh, %, A, V, etc.).

### Problem Solved
Before this enhancement, users had to manually configure the correct measurement for each sensor type. This was problematic because:
1. **Entity Diversity**: HA entities are stored in different measurements based on their units:
   - Power sensors → "W" measurement
   - Price sensors → "EUR/kWh" measurement
   - Percentage sensors → "%" measurement
   - Current sensors → "A" measurement
2. **User Complexity**: Users would need to know InfluxDB's internal structure
3. **Configuration Errors**: Wrong measurement selection caused data retrieval failures
4. **Limited Flexibility**: Adding new sensor types required manual configuration

### Solution: Intelligent Auto-Discovery
The auto-discovery system uses `SHOW TAG VALUES` queries to:
1. **Scan All Measurements**: Checks priority measurements first (EUR/kWh, W, %, etc.)
2. **Find Entity Location**: Locates the exact measurement containing the target entity
3. **Cache Results**: Stores discovered mappings for performance
4. **Fallback Gracefully**: Uses the default measurement if the entity is not found

### Technical Implementation
- Uses `SHOW TAG VALUES FROM "measurement" WITH KEY = "entity_id"`
- Implements measurement caching for performance optimization
- Provides comprehensive error handling and debug logging
- Maintains backward compatibility with manual measurement specification

### Real-World Impact
This enables seamless integration with diverse sensor types:
- ✅ `sensor.sec_prices_afname` → Auto-detected in "EUR/kWh"
- ✅ `sensor.sec_pac_solar` → Auto-detected in "W"
- ✅ `sensor.battery_level` → Auto-detected in "%"
- ✅ Works with any HA entity regardless of unit

### Additional Improvements
- Fixed parameter definitions to use `"input": "secrets.string"` for credentials
- Enhanced error handling for empty datasets
- Added comprehensive debug logging for troubleshooting
- Improved InfluxDB query reliability

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
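A rough sketch of how such a discovery query could look with the InfluxDB v1 Python client. The method name, the priority list of measurements, and the cache attribute are assumptions based on the commit description above, not the actual implementation:

```python
def _discover_measurement(self, client, entity_id: str) -> str:
    """Sketch: find which InfluxDB measurement holds entity_id via SHOW TAG VALUES."""
    cache = getattr(self, "_measurement_cache", {})
    if entity_id in cache:
        return cache[entity_id]
    # Check likely measurements first; Home Assistant groups entities by unit of measurement
    for measurement in ("EUR/kWh", "W", "%", "A", "V", self.influxdb_measurement):
        query = f'SHOW TAG VALUES FROM "{measurement}" WITH KEY = "entity_id"'
        try:
            values = {p.get("value") for p in client.query(query).get_points()}
        except Exception as e:
            self.logger.debug(f"Tag discovery failed for measurement {measurement}: {e}")
            continue
        if entity_id in values:
            cache[entity_id] = measurement
            self._measurement_cache = cache
            return measurement
    # Fall back to the configured default measurement
    return self.influxdb_measurement
```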
## Problem
The `unit_load_cost` and `unit_prod_price` sensors were displaying only
2 decimal places in Home Assistant (e.g., 0.13) even though internally
EMHASS calculated them with 4 decimals (e.g., 0.1291).
This caused precision loss for price-sensitive optimizations where small
differences in electricity prices matter.
## Root Cause
The `get_attr_data_dict()` function hardcoded 2 decimal places:
- Line 716: `vals_list = [str(np.round(i, 2)) ...]`
- Line 724: `"state": f"{state:.2f}"`
## Solution
Added configurable `decimals` parameter to `get_attr_data_dict()`:
- Default: 2 decimals (maintains backward compatibility)
- Price sensors: 4 decimals for `unit_load_cost` and `unit_prod_price`
- Affects both the state value and forecast attribute lists
## Testing
Before: sensor.mpc_unit_load_cost = 0.13 (2 decimals)
After: sensor.mpc_unit_load_cost = 0.1291 (4 decimals)
Logs show: "Successfully posted to sensor.mpc_unit_load_cost = 0.1291"
HA displays: 0.1291
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
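A minimal, self-contained sketch of the change described above. The real `get_attr_data_dict()` takes more parameters; only the rounding logic and the `decimals` argument are illustrated here, with a simplified, hypothetical signature:

```python
import numpy as np

def format_forecast_values(values, state, decimals: int = 2):
    """Sketch of the configurable rounding described above (simplified signature)."""
    # Forecast attribute list, rounded to the requested precision
    vals_list = [str(np.round(v, decimals)) for v in values]
    # State value formatted with the same precision
    state_str = f"{state:.{decimals}f}"
    return vals_list, state_str

# Price sensors such as unit_load_cost / unit_prod_price would pass decimals=4,
# everything else keeps the default of 2 for backward compatibility.
print(format_forecast_values([0.12913, 0.13542], 0.12913, decimals=4))
```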
Hi @scruysberghs. I've added the unit test for the InfluxDB method and pushed it to a branch on my fork.
done (I think).
I cannot disable InfluxDB: setting `use_influxdb: false` does nothing (add-on test).
Strange, not breaking things for people not using InfluxDB was actually the first thing I tested... I'll have a look.
@davidusb-geek, make sure not to release until this is cleared up. I don't have much time right now, but it has something to do with
@tbrasser I can't reproduce this here in my dev container. With `use_influxdb` set to false, empty, or "false" it uses the hass retrieve (HA API calls); if I set `use_influxdb` to true and fill in the other parameters, it retrieves data from InfluxDB.
The config was what I used before; the InfluxDB-related entries got automatically added, including `use_influxdb: false`.
I switched my production machine over to the emhass test add-on but I can't reproduce. In the config.json file I used from production, the default InfluxDB parameters were added automatically: and way at the bottom:
But everything works as before. I tried an ML forecast model fit and my MPC optimizations still work fine using Home Assistant API calls. I even tried disabling my InfluxDB server altogether just to make sure. If I fill in my InfluxDB parameters with the InfluxDB server off I get: After starting the InfluxDB server add-on:
@tbrasser Can you post the runtime parameters you are using for that call you are running every 5 minutes? No InfluxDB-related runtime parameters in there?
@davidusb-geek Are you running the test version? Maybe move this conversation over to "Issues" in case other people are affected (no clue as to how many people run the emhass test add-on..?). I have not been able to reproduce.
I've since reverted to the non-test add-on. I can try again maybe tonight. My naive MPC call sends:
I just tried to test the InfluxDB retrieval but always get an error:
[2025-10-23 19:38:28 +0000] [11] [INFO] InfluxDB integration enabled: https://influxdb.local.<...>.de:8086/home_assistant
I have InfluxDB v1 set up as an LXC container in Proxmox:
According to the documentation, you have to use the /query endpoint: https://docs.influxdata.com/influxdb/v1/administration/authentication_and_authorization/#authenticate-with-the-influxdb-api
Does this PR only work with InfluxDB v2?
Forget my last comment; I had added "https" to the host, which results in the 404 error.
According to the documentation, the parameter "ssl=True" must be used for this setup (https://influxdb-python.readthedocs.io/en/latest/api-documentation.html). I tested it locally and now I get a connection.
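For reference, a minimal connection sketch for an HTTPS-fronted InfluxDB v1 instance; `ssl` and `verify_ssl` are regular influxdb-python client arguments, while the host, credentials, and database below are placeholders:

```python
from influxdb import InfluxDBClient

# Pass only the hostname (no "https://") - the scheme is selected via ssl=True
client = InfluxDBClient(
    host="influxdb.local.example.de",  # placeholder hostname
    port=8086,
    username="emhass",                 # placeholder credentials
    password="secret",
    database="home_assistant",
    ssl=True,                          # use HTTPS when talking to the /query endpoint
    verify_ssl=True,                   # set to False only for self-signed certificates
)
client.ping()
```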
@tbrasser The `"use_influxdb": false` bug was found (but not merged yet).
Your runtime parameters had me take another look at the code; looks like I missed the memo on this. The documentation still only speaks of a CSV file in the data folder or a list of the next values as a runtime parameter. It even annoyed me enough to have Claude Code make a PR with some documentation for this: #603
Yeah, it's amazing. I found it when preparing for 15-minute grid pricing. I now have a 4-day, 15-minute horizon, which solves so many issues I had.
Will there be InfluxDB 2 support as well?
The plan is to stick to 1.x for now (the default version if you install InfluxDB as a HAOS add-on).
OK, then I made a huge mistake setting up an InfluxDB 2 environment. Is it so much harder to integrate into EMHASS?
It needs another Python library and other connection parameters, so it is not just flipping a switch. Feel free to have a look.




As discussed a while back (#490), this feature enables EMHASS to use InfluxDB as an alternative data source to the Home Assistant API (or even other EMS systems like evcc), providing access to longer historical data retention for better machine learning model training. I just spent some Claude Code tokens on this, but it seems to work fine so far.
New Features:
Configuration:
- `"use_influxdb": true` in configuration
Implementation:
- `get_data_influxdb()` method in `RetrieveHass` class
Testing:
Looking for testers and feedback! Please test with your InfluxDB setup and report any issues or suggestions for improvement.
🤖 Generated with Claude Code
Summary by Sourcery
Add optional InfluxDB integration in RetrieveHass to fetch long-term historical data and seamlessly switch between the Home Assistant API and InfluxDB based on user configuration.