Add InfluxDB integration for enhanced historical data retrieval #586
Conversation
This feature enables EMHASS to use InfluxDB as an alternative data source to the Home Assistant API, providing access to longer historical data retention for better machine learning model training.

## New Features:
- **InfluxDB data retrieval**: Complete integration with InfluxDB v1.x
- **Automatic data source selection**: Simple config switch between the HA API and InfluxDB
- **Enhanced ML model training**: Access to months/years of historical data vs. HA's limited retention
- **Improved performance**: Faster queries for large datasets

## Configuration:
- Added 8 new configuration parameters for the InfluxDB connection
- Backward compatible: InfluxDB disabled by default
- Simple activation: `"use_influxdb": true` in the configuration

## Implementation:
- New `get_data_influxdb()` method in the `RetrieveHass` class
- Transparent integration: existing code unchanged
- Robust error handling and connection management
- Data format compatibility with existing HA API data

## Testing:
- Tested with the default InfluxDB 1.x Home Assistant add-on
- Near-perfect correlation (0.999985) between InfluxDB and HA API data
- Successfully integrates with existing EMHASS workflows

**Looking for testers and feedback!** Please test with your InfluxDB setup and report any issues or suggestions for improvement.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
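For orientation, a minimal sketch of what the configuration switch might look like. The parameter names are taken from the class diagram in the reviewer's guide below; the values are placeholders, and the exact key names, defaults, and placement in config.json should be checked against the PR's config schema:

```json
{
  "use_influxdb": true,
  "influxdb_host": "localhost",
  "influxdb_port": 8086,
  "influxdb_username": "emhass",
  "influxdb_password": "secret",
  "influxdb_database": "homeassistant",
  "influxdb_measurement": "W",
  "influxdb_retention_policy": "autogen"
}
```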
## Reviewer's Guide

Introduce optional InfluxDB integration in RetrieveHass via a new configuration switch, transparently falling back to the Home Assistant API and providing robust connection management, query execution, and DataFrame assembly for extended historical data retrieval.

### Sequence diagram for data retrieval with InfluxDB integration

sequenceDiagram
participant "RetrieveHass"
participant "InfluxDBClient"
participant "Logger"
actor "User"
"User"->>"RetrieveHass": get_data(days_list, var_list)
alt use_influxdb is True
"RetrieveHass"->>"Logger": info("Retrieve InfluxDB get data method initiated...")
"RetrieveHass"->>"InfluxDBClient": connect(host, port, ...)
"InfluxDBClient"->>"RetrieveHass": ping()
loop for each sensor in var_list
"RetrieveHass"->>"InfluxDBClient": query(query)
"InfluxDBClient"-->>"RetrieveHass": result
"RetrieveHass"->>"Logger": info("Retrieved N data points for sensor")
end
"RetrieveHass"->>"InfluxDBClient": close()
"RetrieveHass"->>"Logger": info("InfluxDB data retrieval completed")
else use_influxdb is False
"RetrieveHass"->>"Logger": debug("InfluxDB integration disabled, using Home Assistant API")
"RetrieveHass"->>"Home Assistant API": get data
end
### Class diagram for RetrieveHass with InfluxDB integration

classDiagram
class RetrieveHass {
- use_influxdb: bool
- influxdb_host: str
- influxdb_port: int
- influxdb_username: str
- influxdb_password: str
- influxdb_database: str
- influxdb_measurement: str
- influxdb_retention_policy: str
+ get_data(days_list, var_list): bool
+ get_data_influxdb(days_list, var_list): bool
+ get_ha_config()
+ prepare_data(var_load, ...)
}
RetrieveHass ..> InfluxDBClient : uses
class InfluxDBClient {
+ ping()
+ query(query)
+ close()
}
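The diagrams above imply a simple data-source dispatch inside the existing retrieval path. Below is a minimal sketch of that switch; the class and helper names here are illustrative, only `get_data_influxdb` and `use_influxdb` come from the PR, and the exact call site may differ:

```python
# Illustrative sketch of the dispatch implied by the sequence diagram; not the actual EMHASS code.
class RetrieveHassSketch:
    def __init__(self, use_influxdb: bool, logger):
        self.use_influxdb = use_influxdb
        self.logger = logger

    def get_data(self, days_list, var_list) -> bool:
        if self.use_influxdb:
            # Optional InfluxDB path added by this PR
            return self.get_data_influxdb(days_list, var_list)
        self.logger.debug("InfluxDB integration disabled, using Home Assistant API")
        return self.get_data_from_ha_api(days_list, var_list)

    def get_data_influxdb(self, days_list, var_list) -> bool:
        ...  # see the method under review in the comments below

    def get_data_from_ha_api(self, days_list, var_list) -> bool:
        ...  # existing Home Assistant API retrieval
```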
Hey there - I've reviewed your changes and they look great!
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location> `src/emhass/retrieve_hass.py:369-371` </location>
<code_context>
+ return False
+
+ # Calculate total time range
+ start_time = days_list[0]
+ end_time = days_list[-1] + pd.Timedelta(days=1)
+ total_days = (end_time - start_time).days
+
+ self.logger.info(f"Retrieving {len(var_list)} sensors over {total_days} days from InfluxDB")
</code_context>
<issue_to_address>
**issue:** Time range calculation may not handle empty days_list gracefully.
Accessing days_list[0] and days_list[-1] will cause an IndexError if days_list is empty. Please add a check to handle this case.
</issue_to_address>
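A minimal sketch of such a guard, slotted in just before the time-range calculation (the log message wording is an assumption; the early `return False` mirrors the method's existing error handling):

```python
# Sketch: bail out early if days_list is empty, before indexing it
if len(days_list) == 0:
    self.logger.error("Empty days_list passed to get_data_influxdb, cannot build time range")
    return False
start_time = days_list[0]
end_time = days_list[-1] + pd.Timedelta(days=1)
```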
### Comment 2
<location> `src/emhass/retrieve_hass.py:392-399` </location>
<code_context>
+ start_time_str = start_time.strftime('%Y-%m-%dT%H:%M:%SZ')
+ end_time_str = end_time.strftime('%Y-%m-%dT%H:%M:%SZ')
+
+ query = f'''
+ SELECT mean("value") AS "mean_value"
+ FROM "{self.influxdb_database}"."{self.influxdb_retention_policy}"."{self.influxdb_measurement}"
+ WHERE time >= '{start_time_str}'
+ AND time < '{end_time_str}'
+ AND "entity_id"='{entity_id}'
+ GROUP BY time({interval}) FILL(linear)
+ '''
+
+ self.logger.debug(f"InfluxDB query: {query}")
</code_context>
<issue_to_address>
**issue (bug_risk):** InfluxQL query uses FILL(linear), which may not be supported in all InfluxDB versions.
FILL(linear) is only available in InfluxDB Enterprise. If targeting open-source InfluxDB, this query will fail. Please make the fill method configurable or provide a fallback to FILL(previous) or FILL(none) when linear is unsupported.
</issue_to_address>
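One possible shape for this, using a hypothetical `influxdb_fill` option (the option name is not part of the original diff; `FILL(previous)` and `FILL(none)` are safe choices on open-source InfluxDB 1.x):

```python
# Sketch: make the fill mode configurable instead of hardcoding FILL(linear)
fill_mode = getattr(self, "influxdb_fill", "previous")  # e.g. "previous", "none", "linear"
query = f'''
    SELECT mean("value") AS "mean_value"
    FROM "{self.influxdb_database}"."{self.influxdb_retention_policy}"."{self.influxdb_measurement}"
    WHERE time >= '{start_time_str}'
    AND time < '{end_time_str}'
    AND "entity_id"='{entity_id}'
    GROUP BY time({interval}) FILL({fill_mode})
'''
```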
### Comment 3
<location> `src/emhass/retrieve_hass.py:418-420` </location>
<code_context>
+ df_sensor = pd.DataFrame(points)
+
+ # Convert time column and set as index
+ df_sensor['time'] = pd.to_datetime(df_sensor['time'])
+ df_sensor.set_index('time', inplace=True)
+
+ # Rename value column to original sensor name
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Time conversion does not specify timezone awareness.
If timestamps are not in UTC or lack timezone info, this may cause alignment issues. Please ensure timezone is set or validated.
```suggestion
# Convert time column and set as index, ensuring timezone awareness (UTC)
df_sensor['time'] = pd.to_datetime(df_sensor['time'], utc=True)
df_sensor.set_index('time', inplace=True)
```
</issue_to_address>
### Comment 4
<location> `src/emhass/retrieve_hass.py:429-430` </location>
<code_context>
+ continue
+
+ # Handle non-numeric data (same as HA processing)
+ df_sensor[sensor] = pd.to_numeric(df_sensor[sensor], errors='coerce')
+
+ # Create time index for first sensor
</code_context>
<issue_to_address>
**suggestion:** Non-numeric values are coerced to NaN without further handling.
If many values are non-numeric, consider logging a warning or adding explicit handling for cases with a high proportion of NaNs.
```suggestion
# Handle non-numeric data (same as HA processing)
df_sensor[sensor] = pd.to_numeric(df_sensor[sensor], errors='coerce')
# Check proportion of NaNs and log warning if high
nan_count = df_sensor[sensor].isna().sum()
total_count = len(df_sensor[sensor])
if total_count > 0:
nan_ratio = nan_count / total_count
if nan_ratio > 0.2:
self.logger.warning(
f"Sensor '{sensor}' has {nan_count}/{total_count} ({nan_ratio:.1%}) non-numeric values coerced to NaN."
)
```
</issue_to_address>
### Comment 5
<location> `src/emhass/retrieve_hass.py:435-441` </location>
<code_context>
+ # Create time index for first sensor
+ if i == 0:
+     # Create complete time range with specified frequency
+     ts = pd.date_range(
+         start=df_sensor.index.min(),
+         end=df_sensor.index.max(),
+         freq=self.freq
+     )
+     df_complete = pd.DataFrame(index=ts)
+     df_complete = pd.concat([df_complete, df_sensor], axis=1)
+     self.df_final = df_complete
+ else:
</code_context>
<issue_to_address>
**suggestion:** Time range for the first sensor is based on its data only.
Using only the first sensor's time range may exclude data from other sensors. Please align the time index to cover the full range across all sensors.
Suggested implementation:
```python
# Collect min and max timestamps for each sensor
if i == 0:
global_min = df_sensor.index.min()
global_max = df_sensor.index.max()
else:
global_min = min(global_min, df_sensor.index.min())
global_max = max(global_max, df_sensor.index.max())
```
```python
# After processing all sensors, create the complete time index
ts = pd.date_range(
start=global_min,
end=global_max,
freq=self.freq
)
df_complete = pd.DataFrame(index=ts)
# You may need to concatenate all sensor dataframes here as needed
self.df_final = df_complete
self.logger.debug("InfluxDB integration disabled, using Home Assistant API")
```
You will need to:
1. Move the creation of the complete time index and DataFrame (`df_complete`) outside the sensor loop, after all sensors have been processed.
2. Concatenate all sensor dataframes to `df_complete` as needed, rather than just the first sensor.
3. Ensure that `global_min` and `global_max` are initialized before the loop and updated for each sensor.
</issue_to_address>
### Comment 6
<location> `src/emhass/retrieve_hass.py:457-462` </location>
<code_context>
+ return False
+
+ # Set frequency and validate
+ self.df_final = set_df_index_freq(self.df_final)
+ if self.df_final.index.freq != self.freq:
+ self.logger.warning(
</code_context>
<issue_to_address>
**suggestion (bug_risk):** set_df_index_freq is called without error handling.
Wrap set_df_index_freq in a try/except block and log exceptions to improve error handling.
```suggestion
# Set frequency and validate
try:
self.df_final = set_df_index_freq(self.df_final)
except Exception as e:
self.logger.error(f"Exception occurred while setting DataFrame index frequency: {e}")
return False
if self.df_final.index.freq != self.freq:
self.logger.warning(
f"InfluxDB data frequency ({self.df_final.index.freq}) differs from expected ({self.freq})"
)
```
</issue_to_address>
### Comment 7
<location> `src/emhass/retrieve_hass.py:323` </location>
<code_context>
self.var_list = var_list
return True
+ def get_data_influxdb(
+ self,
+ days_list: pd.date_range,
</code_context>
<issue_to_address>
**issue (complexity):** Consider refactoring get_data_influxdb into a short orchestration loop with focused helper methods for connection, query building, data fetching, and merging.
Here’s one way to collapse the 160-line `get_data_influxdb` into a thin loop + a few focused helpers. Everything stays exactly the same under the hood, but each block is now testable and reads in 3–5 lines:
```python
def get_data_influxdb(self, days_list, var_list) -> bool:
client = self._init_influx_client()
if not client:
return False
start, end = days_list[0], days_list[-1] + pd.Timedelta(days=1)
dfs = []
for sensor in filter(None, var_list):
df = self._fetch_sensor_df(client, sensor, start, end)
if df is not None:
dfs.append(df)
client.close()
if not dfs:
self.logger.error("No data retrieved from InfluxDB")
return False
self.df_final = self._merge_dfs(dfs)
self.var_list = var_list
self.logger.info(f"InfluxDB data retrieval completed: {self.df_final.shape}")
return True
```
Below are the helpers—you can fold these into your class:
```python
def _init_influx_client(self):
try:
from influxdb import InfluxDBClient
client = InfluxDBClient(
host=self.influxdb_host,
port=self.influxdb_port,
username=self.influxdb_username or None,
password=self.influxdb_password or None,
database=self.influxdb_database,
)
client.ping()
self.logger.debug(f"Connected to InfluxDB @ {self.influxdb_host}:{self.influxdb_port}")
return client
except ImportError:
self.logger.error("pip install influxdb")
except Exception as e:
self.logger.error(f"InfluxDB connection failed: {e}")
return None
```
```python
def _build_query(self, sensor, start, end):
entity = sensor.removeprefix("sensor.")
interval = f"{int(self.freq.total_seconds()/60)}m"
st, et = start.strftime("%Y-%m-%dT%H:%M:%SZ"), end.strftime("%Y-%m-%dT%H:%M:%SZ")
return f'''
SELECT mean("value") AS "mean_value"
FROM "{self.influxdb_database}"."{self.influxdb_retention_policy}"."{self.influxdb_measurement}"
WHERE time >= '{st}' AND time < '{et}' AND "entity_id"='{entity}'
GROUP BY time({interval}) FILL(linear)
'''
```
```python
def _fetch_sensor_df(self, client, sensor, start, end):
try:
query = self._build_query(sensor, start, end)
points = list(client.query(query).get_points())
if not points:
self.logger.warning(f"No data for {sensor}")
return None
df = pd.DataFrame(points)
df['time'] = pd.to_datetime(df['time'])
df.set_index('time', inplace=True)
df = df.rename(columns={'mean_value': sensor})
df[sensor] = pd.to_numeric(df[sensor], errors='coerce')
# ensure a full index on the first sensor
if not hasattr(self, "_base_index"):
idx = pd.date_range(df.index.min(), df.index.max(), freq=self.freq)
self._base_index = pd.DataFrame(index=idx)
return pd.concat([self._base_index, df], axis=1)
except Exception as e:
self.logger.error(f"Query failed for {sensor}: {e}")
return None
```
```python
def _merge_dfs(self, dfs):
df = pd.concat(dfs, axis=1)
df = set_df_index_freq(df)
if df.index.freq != self.freq:
self.logger.warning(
f"Freq mismatch: got {df.index.freq}, expected {self.freq}"
)
return df
```
Benefits:
- `get_data_influxdb` is now a 10-line orchestration loop
- Each helper does one job: connect, build query, fetch/normalize, merge
- Easier to test/fix individual pieces without wading through the whole method
</issue_to_address>
### Comment 8
<location> `src/emhass/retrieve_hass.py:357` </location>
<code_context>
username=self.influxdb_username if self.influxdb_username else None,
</code_context>
<issue_to_address>
**suggestion (code-quality):** Replace if-expression with `or` ([`or-if-exp-identity`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/Python/Default-Rules/or-if-exp-identity))
```suggestion
username=self.influxdb_username or None,
```
<br/><details><summary>Explanation</summary>Here we find ourselves setting a value if it evaluates to `True`, and otherwise
using a default. The `or` form is a bit easier to read and avoids repeating the same expression.
It works because the left-hand side is evaluated first: if it is truthy, that value is used and the
right-hand side is never evaluated; if it is falsy, the right-hand side (the default, here `None`) is used instead.
</issue_to_address>
### Comment 9
<location> `src/emhass/retrieve_hass.py:358` </location>
<code_context>
password=self.influxdb_password if self.influxdb_password else None,
</code_context>
<issue_to_address>
**suggestion (code-quality):** Replace if-expression with `or` ([`or-if-exp-identity`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/Python/Default-Rules/or-if-exp-identity))
```suggestion
password=self.influxdb_password or None,
```
</issue_to_address>
### Comment 10
<location> `src/emhass/retrieve_hass.py:323` </location>
<code_context>
def get_data_influxdb(
    self,
    days_list: pd.date_range,
    var_list: list,
) -> bool:
    """
    Retrieve data from InfluxDB database.
    This method provides an alternative data source to Home Assistant API,
    enabling longer historical data retention for better machine learning model training.
    :param days_list: A list of days to retrieve data for
    :type days_list: pandas.date_range
    :param var_list: List of sensor entity IDs to retrieve
    :type var_list: list
    :return: Success status of data retrieval
    :rtype: bool
    """
    self.logger.info("Retrieve InfluxDB get data method initiated...")
    try:
        from influxdb import InfluxDBClient
    except ImportError:
        self.logger.error("InfluxDB client not installed. Install with: pip install influxdb")
        return False
    # Remove empty strings from var_list
    var_list = [var for var in var_list if var != ""]
    # Connect to InfluxDB
    try:
        client = InfluxDBClient(
            host=self.influxdb_host,
            port=self.influxdb_port,
            username=self.influxdb_username if self.influxdb_username else None,
            password=self.influxdb_password if self.influxdb_password else None,
            database=self.influxdb_database
        )
        # Test connection
        client.ping()
        self.logger.debug(f"Successfully connected to InfluxDB at {self.influxdb_host}:{self.influxdb_port}")
    except Exception as e:
        self.logger.error(f"Failed to connect to InfluxDB: {e}")
        return False
    # Calculate total time range
    start_time = days_list[0]
    end_time = days_list[-1] + pd.Timedelta(days=1)
    total_days = (end_time - start_time).days
    self.logger.info(f"Retrieving {len(var_list)} sensors over {total_days} days from InfluxDB")
    self.logger.debug(f"Time range: {start_time} to {end_time}")
    self.df_final = pd.DataFrame()
    for i, sensor in enumerate(var_list):
        self.logger.debug(f"Retrieving sensor {i+1}/{len(var_list)}: {sensor}")
        # Convert sensor name: sensor.sec_pac_solar -> sec_pac_solar
        entity_id = sensor.replace('sensor.', '') if sensor.startswith('sensor.') else sensor
        # Convert frequency to InfluxDB interval
        freq_minutes = int(self.freq.total_seconds() / 60)
        interval = f"{freq_minutes}m"
        # Build InfluxQL query - format times properly for InfluxDB
        start_time_str = start_time.strftime('%Y-%m-%dT%H:%M:%SZ')
        end_time_str = end_time.strftime('%Y-%m-%dT%H:%M:%SZ')
        query = f'''
            SELECT mean("value") AS "mean_value"
            FROM "{self.influxdb_database}"."{self.influxdb_retention_policy}"."{self.influxdb_measurement}"
            WHERE time >= '{start_time_str}'
            AND time < '{end_time_str}'
            AND "entity_id"='{entity_id}'
            GROUP BY time({interval}) FILL(linear)
        '''
        self.logger.debug(f"InfluxDB query: {query}")
        try:
            # Execute query
            result = client.query(query)
            # Convert result to points
            points = list(result.get_points())
            if not points:
                self.logger.warning(f"No data found for sensor: {sensor}")
                continue
            self.logger.info(f"Retrieved {len(points)} data points for {sensor}")
            # Create DataFrame from points
            df_sensor = pd.DataFrame(points)
            # Convert time column and set as index
            df_sensor['time'] = pd.to_datetime(df_sensor['time'])
            df_sensor.set_index('time', inplace=True)
            # Rename value column to original sensor name
            if 'mean_value' in df_sensor.columns:
                df_sensor = df_sensor[['mean_value']].rename(columns={'mean_value': sensor})
            else:
                self.logger.error(f"Expected 'mean_value' column not found for {sensor}")
                continue
            # Handle non-numeric data (same as HA processing)
            df_sensor[sensor] = pd.to_numeric(df_sensor[sensor], errors='coerce')
            # Create time index for first sensor
            if i == 0:
                # Create complete time range with specified frequency
                ts = pd.date_range(
                    start=df_sensor.index.min(),
                    end=df_sensor.index.max(),
                    freq=self.freq
                )
                df_complete = pd.DataFrame(index=ts)
                df_complete = pd.concat([df_complete, df_sensor], axis=1)
                self.df_final = df_complete
            else:
                # Add to existing dataframe
                self.df_final = pd.concat([self.df_final, df_sensor], axis=1)
        except Exception as e:
            self.logger.error(f"Failed to query sensor {sensor}: {e}")
            continue
    client.close()
    if self.df_final.empty:
        self.logger.error("No data retrieved from InfluxDB")
        return False
    # Set frequency and validate
    self.df_final = set_df_index_freq(self.df_final)
    if self.df_final.index.freq != self.freq:
        self.logger.warning(
            f"InfluxDB data frequency ({self.df_final.index.freq}) differs from expected ({self.freq})"
        )
    self.var_list = var_list
    self.logger.info(f"InfluxDB data retrieval completed: {self.df_final.shape}")
    return True
</code_context>
<issue_to_address>
**issue (code-quality):** Low code quality found in RetrieveHass.get\_data\_influxdb - 19% ([`low-code-quality`](https://docs.sourcery.ai/Reference/Default-Rules/comments/low-code-quality/))
<br/><details><summary>Explanation</summary>The quality score for this function is below the quality threshold of 25%.
This score is a combination of the method length, cognitive complexity and working memory.
How can you solve this?
It might be worth refactoring this function to make it shorter and more readable.
- Reduce the function length by extracting pieces of functionality out into
their own functions. This is the most important thing you can do - ideally a
function should be less than 10 lines.
- Reduce nesting, perhaps by introducing guard clauses to return early.
- Ensure that variables are tightly scoped, so that code using related concepts
sits together within the function rather than being scattered.</details>
</issue_to_address>
This is a great feature.
- Fix empty days_list handling to prevent IndexError
- Change FILL(linear) to FILL(previous) for open-source InfluxDB compatibility
- Add timezone awareness with utc=True parameter to pd.to_datetime()
- Add NaN ratio warnings when >20% of sensor data is non-numeric
- Fix time index to span globally across all sensors instead of just the first sensor
- Add comprehensive error handling for the set_df_index_freq() method
- Refactor the large get_data_influxdb() method into focused helper functions:
  * _init_influx_client() for connection setup
  * _build_influx_query() for InfluxQL query construction
  * _fetch_sensor_data() for individual sensor data retrieval
- Replace if-expressions with cleaner 'or' operators for default values

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
## Why Auto-Discovery is Essential

This commit adds intelligent auto-discovery that automatically detects which InfluxDB measurement contains any given Home Assistant entity, regardless of its unit of measurement (W, EUR/kWh, %, A, V, etc.).

### Problem Solved
Before this enhancement, users had to manually configure the correct measurement for each sensor type. This was problematic because:
1. **Entity Diversity**: HA entities are stored in different measurements based on their units:
   - Power sensors → "W" measurement
   - Price sensors → "EUR/kWh" measurement
   - Percentage sensors → "%" measurement
   - Current sensors → "A" measurement
2. **User Complexity**: Users would need to know InfluxDB's internal structure
3. **Configuration Errors**: Wrong measurement selection caused data retrieval failures
4. **Limited Flexibility**: Adding new sensor types required manual configuration

### Solution: Intelligent Auto-Discovery
The auto-discovery system uses `SHOW TAG VALUES` queries to:
1. **Scan All Measurements**: Checks priority measurements first (EUR/kWh, W, %, etc.)
2. **Find Entity Location**: Locates the exact measurement containing the target entity
3. **Cache Results**: Stores discovered mappings for performance
4. **Fallback Gracefully**: Uses the default measurement if the entity is not found

### Technical Implementation
- Uses `SHOW TAG VALUES FROM "measurement" WITH KEY = "entity_id"`
- Implements measurement caching for performance optimization
- Provides comprehensive error handling and debug logging
- Maintains backward compatibility with manual measurement specification

### Real-World Impact
This enables seamless integration with diverse sensor types:
- ✅ `sensor.sec_prices_afname` → Auto-detected in "EUR/kWh"
- ✅ `sensor.sec_pac_solar` → Auto-detected in "W"
- ✅ `sensor.battery_level` → Auto-detected in "%"
- ✅ Works with any HA entity regardless of unit

### Additional Improvements
- Fixed parameter definitions to use `"input": "secrets.string"` for credentials
- Enhanced error handling for empty datasets
- Added comprehensive debug logging for troubleshooting
- Improved InfluxDB query reliability

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
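A rough sketch of how such a discovery query could look with the InfluxDB v1 Python client. The method name, the priority list of measurements, and the cache attribute are assumptions based on the commit description above, not the actual implementation:

```python
def _discover_measurement(self, client, entity_id: str) -> str:
    """Sketch: find which InfluxDB measurement holds entity_id via SHOW TAG VALUES."""
    cache = getattr(self, "_measurement_cache", {})
    if entity_id in cache:
        return cache[entity_id]
    # Check likely measurements first; Home Assistant groups entities by unit of measurement
    for measurement in ("EUR/kWh", "W", "%", "A", "V", self.influxdb_measurement):
        query = f'SHOW TAG VALUES FROM "{measurement}" WITH KEY = "entity_id"'
        try:
            values = {p.get("value") for p in client.query(query).get_points()}
        except Exception as e:
            self.logger.debug(f"Tag discovery failed for measurement {measurement}: {e}")
            continue
        if entity_id in values:
            cache[entity_id] = measurement
            self._measurement_cache = cache
            return measurement
    # Fall back to the configured default measurement
    return self.influxdb_measurement
```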
## Problem
The `unit_load_cost` and `unit_prod_price` sensors were displaying only
2 decimal places in Home Assistant (e.g., 0.13) even though internally
EMHASS calculated them with 4 decimals (e.g., 0.1291).
This caused precision loss for price-sensitive optimizations where small
differences in electricity prices matter.
## Root Cause
The `get_attr_data_dict()` function hardcoded 2 decimal places:
- Line 716: `vals_list = [str(np.round(i, 2)) ...]`
- Line 724: `"state": f"{state:.2f}"`
## Solution
Added configurable `decimals` parameter to `get_attr_data_dict()`:
- Default: 2 decimals (maintains backward compatibility)
- Price sensors: 4 decimals for `unit_load_cost` and `unit_prod_price`
- Affects both the state value and forecast attribute lists
## Testing
Before: sensor.mpc_unit_load_cost = 0.13 (2 decimals)
After: sensor.mpc_unit_load_cost = 0.1291 (4 decimals)
Logs show: "Successfully posted to sensor.mpc_unit_load_cost = 0.1291"
HA displays: 0.1291
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
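A minimal, self-contained sketch of the change described above. The real `get_attr_data_dict()` takes more parameters; only the rounding logic and the `decimals` argument are illustrated here, with a simplified, hypothetical signature:

```python
import numpy as np

def format_forecast_values(values, state, decimals: int = 2):
    """Sketch of the configurable rounding described above (simplified signature)."""
    # Forecast attribute list, rounded to the requested precision
    vals_list = [str(np.round(v, decimals)) for v in values]
    # State value formatted with the same precision
    state_str = f"{state:.{decimals}f}"
    return vals_list, state_str

# Price sensors such as unit_load_cost / unit_prod_price would pass decimals=4,
# everything else keeps the default of 2 for backward compatibility.
print(format_forecast_values([0.12913, 0.13542], 0.12913, decimals=4))
```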
Hi @scruysberghs. I've added the unit test for the InfluxDB method and pushed it to a branch on my fork.
done (I think).
I cannot disable InfluxDB: setting `use_influxdb: false` does nothing (add-on test).
Strange, not breaking things for people not using InfluxDB was actually the first thing I tested... I'll have a look.
@davidusb-geek, make sure not to release until this is cleared up. I don't have much time right now, but it has something to do with
@tbrasser I can't reproduce this here in my dev container. With `use_influxdb` set to false, empty, or "false" it uses the hass retrieve (HA API calls); if I set `use_influxdb` to true and fill in the other parameters, it retrieves data from InfluxDB.
The config was what I used before; the InfluxDB-related entries got automatically added, including `use_influxdb: false`.
I switched my production machine over to the emhass test add-on but I can't reproduce. In the config.json file I used from production, the default InfluxDB parameters were added automatically: and way at the bottom:
But everything works as before. I tried an ML forecast model fit and my MPC optimizations still work fine using Home Assistant API calls. I even tried disabling my InfluxDB server altogether just to make sure. If I fill in my InfluxDB parameters with the InfluxDB server off I get: After starting the InfluxDB server add-on:
@tbrasser Can you post the runtime parameters you are using for that call you are running every 5 minutes? No InfluxDB-related runtime parameters in there?
@davidusb-geek Are you running the test version? Maybe move this conversation over to "Issues" in case other people are affected (no clue as to how many people run the emhass test add-on..?). I have not been able to reproduce.
I've since reverted to the non-test add-on. I can try again maybe tonight. My naive MPC call sends:
I just tried to test the InfluxDB retrieval but always get an error:
[2025-10-23 19:38:28 +0000] [11] [INFO] InfluxDB integration enabled: https://influxdb.local.<...>.de:8086/home_assistant
I have InfluxDB v1 set up as an LXC container in Proxmox:
According to the documentation, you have to use the /query endpoint: https://docs.influxdata.com/influxdb/v1/administration/authentication_and_authorization/#authenticate-with-the-influxdb-api
Does this PR only work with InfluxDB v2?
Forget my last comment; I had added "https" to the host, which results in the 404 error.
According to the documentation, the parameter "ssl=True" must be used for this setup (https://influxdb-python.readthedocs.io/en/latest/api-documentation.html). I tested it locally and now I get a connection.
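For reference, a minimal connection sketch for an HTTPS-fronted InfluxDB v1 instance; `ssl` and `verify_ssl` are regular influxdb-python client arguments, while the host, credentials, and database below are placeholders:

```python
from influxdb import InfluxDBClient

# Pass only the hostname (no "https://") - the scheme is selected via ssl=True
client = InfluxDBClient(
    host="influxdb.local.example.de",  # placeholder hostname
    port=8086,
    username="emhass",                 # placeholder credentials
    password="secret",
    database="home_assistant",
    ssl=True,                          # use HTTPS when talking to the /query endpoint
    verify_ssl=True,                   # set to False only for self-signed certificates
)
client.ping()
```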
@tbrasser The `"use_influxdb": false` bug was found (but not merged yet).
Your runtime parameters had me take another look at the code; looks like I missed the memo on this. The documentation still only speaks of a CSV file in the data folder or a list of the next values as a runtime parameter. It even annoyed me enough to have Claude Code make a PR with some documentation for this: #603
Yeah, it's amazing. I found it when preparing for 15-minute grid pricing. I now have a 4-day, 15-minute horizon, which solves so many issues I had.
Will there be InfluxDB 2 support as well?
The plan is to stick to 1.x for now (the default version if you install InfluxDB as a HAOS add-on).
OK, then I made a huge mistake setting up an InfluxDB 2 environment. Is it so much harder to integrate into EMHASS?
It needs another Python library and other connection parameters, so it is not just flipping a switch. Feel free to have a look.




As discussed a while back (#490), this feature enables EMHASS to use InfluxDB as an alternative data source to the Home Assistant API (or even other EMS systems like evcc), providing access to longer historical data retention for better machine learning model training. I just spent some Claude Code tokens on this, but it seems to work fine so far.
New Features:
Configuration:
- `"use_influxdb": true` in configuration
Implementation:
- `get_data_influxdb()` method in `RetrieveHass` class
Testing:
Looking for testers and feedback! Please test with your InfluxDB setup and report any issues or suggestions for improvement.
🤖 Generated with Claude Code
Summary by Sourcery
Add optional InfluxDB integration in RetrieveHass to fetch long-term historical data and seamlessly switch between the Home Assistant API and InfluxDB based on user configuration.