The BMS estimates state of charge using three algorithms that each fail in different ways — the engineering challenge is combining them so the failures do not overlap.
- SOC estimation uses three algorithms in combination: Coulomb counting for dynamic tracking, OCV lookup for periodic anchoring, and a Kalman filter for optimal fusion.
- Coulomb counting is the backbone — accurate during operation but drifts without periodic correction.
- LFP's flat OCV curve makes voltage-based correction nearly useless in the 20–90% SOC range; LFP BMS must anchor at endpoints only.
- The Extended Kalman Filter fuses coulomb counting with voltage correction and can recover from a wrong initial SOC estimate — unlike pure coulomb counting.
- Protection thresholds use tiered hysteresis (warning → derate → disconnect) to prevent nuisance trips while maintaining fast response to genuine faults.
The BMS must know things it cannot directly measure. State of Charge is not a physical quantity you can probe with a sensor — it is an internal thermodynamic state accessible only through imperfect proxies (voltage, current, temperature). State of Health is even more abstract — it describes the cumulative result of thousands of electrochemical aging events that individually leave no measurable trace. This article explains how the algorithms behind these estimates work, where they fail, and how protection logic translates estimates into safe operating decisions.
Coulomb Counting: The Integration Engine
Coulomb counting is the backbone of every production BMS. The equation is simple:
$$\mathrm{SOC}(t) = \mathrm{SOC}(t_0) - \frac{1}{Q_{\text{rated}}} \int_{t_0}^{t} \eta \, I(\tau) \, d\tau$$

where η is the coulombic efficiency (~0.99 for Li-ion, slightly less during fast charging), I is the measured current (positive for discharge, so SOC falls while the cell discharges), and Q_rated is the total cell capacity, expressed in the same charge units as the integral.
In discrete firmware implementation:

$$\mathrm{SOC}[k] = \mathrm{SOC}[k-1] - \frac{\eta \, I[k] \, \Delta t}{Q_{\text{rated}}}$$

What it does well: Tracks SOC continuously during active operation. Responds immediately to current changes. Works for any chemistry. Computationally trivial.
Where it fails: The integration accumulates error. The starting SOC must be known (or estimated). If Q_rated is wrong (battery has aged), every percentage is systematically off. The current sensor's offset and gain errors integrate into SOC drift over time.
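To make the drift mechanism concrete, the sketch below integrates a current reading that carries a constant offset while the pack is actually at rest. The 0.1 A offset, 50 Ah capacity, and 10 hour duration are illustrative assumptions, not figures from any particular sensor datasheet, and `simulate_offset_drift` is a hypothetical helper name.

```python
def simulate_offset_drift(offset_A: float, hours: float, Q_Ah: float,
                          dt_s: float = 1.0) -> float:
    """Return the SOC error (as a fraction) caused by a constant sensor offset.

    The true current is zero (pack at rest), so any SOC change is pure error.
    """
    soc_true = 0.80
    soc_est = 0.80
    steps = int(hours * 3600.0 / dt_s)
    for _ in range(steps):
        measured_A = 0.0 + offset_A  # sensor reports the offset instead of 0 A
        soc_est -= (measured_A * dt_s) / (Q_Ah * 3600.0)
    return soc_true - soc_est


drift = simulate_offset_drift(offset_A=0.1, hours=10.0, Q_Ah=50.0)
print(f"SOC error after 10 h: {drift * 100:.2f} %")
```

Under these assumptions the estimate ends up about 2% SOC away from the truth after ten hours, with nothing in the coulomb counter itself able to detect it.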

    def coulomb_count(soc_prev: float, I_A: float, dt_s: float, Q_Ah: float) -> float:
        """SOC update via Coulomb counting. I_A > 0 = discharge.

        Coulombic efficiency is taken as 1 here for simplicity.
        """
        delta_soc = (I_A * dt_s) / (Q_Ah * 3600.0)  # A*s divided by capacity in A*s
        return float(max(0.0, min(1.0, soc_prev - delta_soc)))


    def ocv_lookup(soc: float, ocv_table: list) -> float:
        """Linear interpolation from (soc, ocv_V) pairs sorted by ascending SOC."""
        if soc <= ocv_table[0][0]:
            return ocv_table[0][1]  # clamp below the table range
        for i in range(len(ocv_table) - 1):
            s0, v0 = ocv_table[i]
            s1, v1 = ocv_table[i + 1]
            if s0 <= soc <= s1:
                return v0 + (v1 - v0) * (soc - s0) / (s1 - s0)
        return ocv_table[-1][1]  # clamp above the table range


    # Example: 1C discharge on a 50 Ah NMC pack
    nmc_ocv = [(0.0, 3.00), (0.25, 3.65), (0.50, 3.82), (0.75, 3.97), (1.00, 4.18)]
    soc = 0.80
    for step in range(5):
        soc = coulomb_count(soc, I_A=50.0, dt_s=60.0, Q_Ah=50.0)
        print(f"Step {step+1}: SOC={soc:.4f} OCV={ocv_lookup(soc, nmc_ocv):.3f} V")

During charging or discharging, the terminal voltage includes overpotential contributions from internal resistance, diffusion, and reaction kinetics, which can add or subtract hundreds of millivolts from the open-circuit value. Only after the battery has rested for 30+ minutes does the terminal voltage approach the true OCV that maps reliably to SOC. For LFP chemistry, even resting voltage changes less than 50 mV across 70% of the SOC range, making voltage a fundamentally poor SOC indicator for this chemistry regardless of rest state.
OCV Lookup: The Periodic Anchor
Open Circuit Voltage (OCV) is the battery's terminal voltage in thermodynamic equilibrium — after sufficient rest. Each SOC corresponds to a specific OCV (at a given temperature). The BMS stores this OCV-SOC relationship as a lookup table, characterised during cell development.
Rest detection: The BMS monitors |dV/dt| — the rate of voltage change. When this falls below approximately 1 mV/minute, the BMS considers the voltage stabilised and applies an OCV correction.
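The rest-detection rule can be sketched as a check on the voltage change over a recent window. The 1 mV/minute threshold comes from the text; the 5 minute window, the 90% history requirement, and the function name `is_rested` are assumptions made for illustration.

```python
def is_rested(samples: list, threshold_mV_per_min: float = 1.0,
              window_s: float = 300.0) -> bool:
    """Decide whether the cell voltage has stabilised.

    samples: (time_s, voltage_V) pairs, oldest first.
    Returns True when |dV/dt| over the most recent window is below the threshold.
    """
    if len(samples) < 2:
        return False
    t_end, v_end = samples[-1]
    # Keep only the samples that fall inside the evaluation window
    window = [(t, v) for t, v in samples if t_end - t <= window_s]
    t_start, v_start = window[0]
    if t_end - t_start < 0.9 * window_s:
        return False  # not enough history to judge yet
    rate_mV_per_min = abs(v_end - v_start) * 1000.0 / ((t_end - t_start) / 60.0)
    return rate_mV_per_min < threshold_mV_per_min


# One sample per minute for five minutes
relaxing = [(t * 60.0, 3.700 + 0.0020 * t) for t in range(6)]  # ~2 mV/min
settled = [(t * 60.0, 3.700 + 0.0004 * t) for t in range(6)]   # ~0.4 mV/min
print(is_rested(relaxing), is_rested(settled))
```

A production implementation would typically also require near-zero current for the whole window; that condition is left out here to keep the sketch focused on the dV/dt test.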
The LFP plateau problem: LFP's OCV changes by only ~50 mV across the 15–90% SOC range, an average slope below 1 mV per 1% SOC. Voltage measurement noise of 1–2 mV therefore translates to at least 1–2% SOC uncertainty. Hysteresis (the OCV differs depending on whether the cell was last charged or discharged) adds another 3–8% uncertainty. In the plateau, OCV correction for LFP is largely useless: it cannot improve on what coulomb counting already estimated.
| Chemistry | OCV change across mid-SOC range | OCV anchor usefulness | Required strategy (CC = coulomb counting) |
|---|---|---|---|
| NMC 811 | ~300–400 mV over 60% SOC | Excellent — strong correction signal | CC primary, OCV correction frequently |
| NMC 622/532 | ~250–350 mV over 60% SOC | Good | CC primary, OCV correction at rest |
| LFP | ~40–60 mV over 75% SOC | Poor in mid-range | CC primary, OCV only for endpoint anchoring |
| NMC 111 | ~200–300 mV over 60% SOC | Moderate | CC primary, OCV at rest |
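The practical meaning of the table can be checked with a little arithmetic: dividing the voltage noise by the average OCV slope gives the SOC uncertainty of a single OCV reading. The sketch below uses mid-points of the ranges in the table and assumes a straight-line OCV curve over each span, which real cells only approximate; `soc_uncertainty_pct` is a hypothetical helper.

```python
def soc_uncertainty_pct(noise_mV: float, ocv_span_mV: float,
                        soc_span_pct: float) -> float:
    """SOC uncertainty (% SOC) from voltage noise, assuming a linear OCV curve."""
    slope_mV_per_pct = ocv_span_mV / soc_span_pct
    return noise_mV / slope_mV_per_pct


chemistries = {
    "NMC 811": (350.0, 60.0),  # mid-point of 300-400 mV over 60% SOC
    "LFP": (50.0, 75.0),       # mid-point of 40-60 mV over 75% SOC
}
for name, (span_mV, span_pct) in chemistries.items():
    err = soc_uncertainty_pct(noise_mV=2.0, ocv_span_mV=span_mV, soc_span_pct=span_pct)
    print(f"{name}: 2 mV noise -> {err:.2f} % SOC")
```

With 2 mV of noise the NMC 811 reading is good to roughly a third of a percent, while the LFP reading is uncertain by about 3% before hysteresis is even considered.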
The Kalman Filter: Optimal Fusion
The Extended Kalman Filter is the standard algorithm for combining coulomb counting with voltage-based correction in a statistically optimal way.
The key insight: The Kalman filter is not a replacement for coulomb counting or OCV lookup — it is a framework for fusing them. The filter:
- Predicts next SOC from coulomb counting (process model)
- Predicts what terminal voltage the battery should show at that SOC (measurement model)
- Compares predicted to measured terminal voltage
- Updates the SOC estimate by the discrepancy, weighted by how much it trusts the model vs the measurements
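The weighting is the Kalman gain K, which is set by two noise covariances. If the measurement noise covariance R is large, the voltage measurement is treated as unreliable, the model is trusted more, and corrections are small. If the process noise covariance Q is large, the coulomb-counting model is treated as unreliable, the measurement is trusted more, and corrections are aggressive.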
For NMC, the Kalman gain is high in the mid-SOC range because the OCV slope provides strong observability — a 10 mV terminal voltage error at 50% SOC corresponds to a well-defined SOC correction. For LFP in the plateau, the OCV slope is near-zero, the gain collapses to near-zero, and the filter essentially behaves as pure coulomb counting. This is mathematically correct — voltage measurements really do carry no SOC information in the LFP plateau — but it means LFP BMS cannot self-correct during operation between endpoint rest states.
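The effect of OCV slope on the gain can be seen with the scalar form of the gain equation, K = P·H / (H·P·H + R), where H is dOCV/dSOC. The values of P and R below are assumed tuning numbers chosen only for illustration, and the slopes are rough averages derived from the table above.

```python
def kalman_gain(P: float, H: float, R: float) -> float:
    """Scalar Kalman gain K = P*H / (H*P*H + R)."""
    return P * H / (H * P * H + R)


P = 1e-3       # SOC error variance (assumed)
R = 1e-4       # voltage noise variance in V^2 (assumed, about 10 mV sigma)
H_nmc = 0.58   # V per unit SOC, from ~350 mV over 60% SOC
H_lfp = 0.067  # V per unit SOC, from ~50 mV over 75% SOC

for name, H in (("NMC", H_nmc), ("LFP", H_lfp)):
    K = kalman_gain(P, H, R)
    # K*H is the fraction of the current SOC error removed by one update
    print(f"{name}: K={K:.3f}  K*H={K * H:.3f}")
```

K has units of SOC per volt, so the product K·H is the easier figure to interpret. Under these assumptions one NMC update removes most of the SOC error, while one LFP update removes only a few percent of it.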

    import numpy as np


    class ExtendedKalmanFilter:
        """1-state SOC EKF for lithium-ion cells (uses ocv_lookup defined earlier)."""

        def __init__(self, Q_noise: float = 1e-5, R_noise: float = 1e-3):
            self.x = np.array([0.8])        # initial SOC estimate
            self.P = np.array([[0.1]])      # initial error covariance
            self.Q = np.array([[Q_noise]])  # process noise covariance
            self.R = np.array([[R_noise]])  # measurement noise covariance

        def predict(self, I_A: float, dt_s: float, Q_Ah: float):
            """Time update: integrate the current measurement."""
            self.x[0] -= (I_A * dt_s) / (Q_Ah * 3600.0)
            self.x[0] = np.clip(self.x[0], 0.0, 1.0)
            self.P = self.P + self.Q  # F=1 for this model

        def update(self, V_meas: float, ocv_table: list):
            """Measurement update: correct SOC with the measured voltage.

            The measurement model is OCV only (no IR drop or RC dynamics),
            so V_meas should be a rested or IR-compensated voltage.
            """
            soc = float(self.x[0])
            V_pred = ocv_lookup(soc, ocv_table)
            # H = dOCV/dSOC, estimated by a finite difference on the OCV table
            lo, hi = max(0.0, soc - 0.01), min(1.0, soc + 0.01)
            H = np.array([[(ocv_lookup(hi, ocv_table) - ocv_lookup(lo, ocv_table)) / (hi - lo)]])
            S = H @ self.P @ H.T + self.R
            K = self.P @ H.T @ np.linalg.inv(S)  # Kalman gain
            self.x = np.clip(self.x + K[:, 0] * (V_meas - V_pred), 0.0, 1.0)
            self.P = (np.eye(1) - K @ H) @ self.P

The Kalman gain determines how much the filter trusts a new voltage measurement versus its own model prediction. It scales with the OCV slope (dOCV/dSOC): a steep slope means voltage carries strong SOC information, and the gain is high. For LFP between 20–90% SOC, the OCV slope is near-zero (less than 1 mV per 1% SOC), so the gain collapses toward zero and the filter essentially ignores voltage measurements in that range. This is mathematically correct behaviour, since voltage carries almost no SOC information in the LFP plateau, but it means the filter cannot self-correct during operation for LFP packs.
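The claim that the filter can recover from a wrong initial SOC, while pure coulomb counting cannot, can be demonstrated with a self-contained scalar simulation. This is a minimal sketch under idealised assumptions: the measured voltage equals the OCV at the true SOC (no noise, no IR drop), the OCV table is the NMC example used earlier, and the tuning values P, Q, and R are arbitrary choices for illustration.

```python
NMC_OCV = [(0.0, 3.00), (0.25, 3.65), (0.50, 3.82), (0.75, 3.97), (1.00, 4.18)]


def ocv(soc: float) -> float:
    """Piecewise-linear OCV(SOC) from the table above."""
    soc = max(0.0, min(1.0, soc))
    for (s0, v0), (s1, v1) in zip(NMC_OCV, NMC_OCV[1:]):
        if s0 <= soc <= s1:
            return v0 + (v1 - v0) * (soc - s0) / (s1 - s0)
    return NMC_OCV[-1][1]


def ocv_slope(soc: float, d: float = 0.01) -> float:
    """Finite-difference dOCV/dSOC in volts per unit SOC."""
    lo, hi = max(0.0, soc - d), min(1.0, soc + d)
    return (ocv(hi) - ocv(lo)) / (hi - lo)


def run(steps: int = 120, dt_s: float = 10.0, I_A: float = 10.0, Q_Ah: float = 50.0):
    soc_true, soc_cc, soc_ekf = 0.80, 0.50, 0.50  # both estimators start 30% wrong
    P, Q, R = 0.1, 1e-7, 1e-4                     # assumed tuning values
    for _ in range(steps):
        d_soc = (I_A * dt_s) / (Q_Ah * 3600.0)
        soc_true -= d_soc
        soc_cc -= d_soc                # pure coulomb counting
        soc_ekf -= d_soc               # EKF predict step
        P += Q
        v_meas = ocv(soc_true)         # idealised measurement: no noise, no IR drop
        H = ocv_slope(soc_ekf)
        K = P * H / (H * P * H + R)    # EKF update step
        soc_ekf = max(0.0, min(1.0, soc_ekf + K * (v_meas - ocv(soc_ekf))))
        P = (1.0 - K * H) * P
    return soc_true, soc_cc, soc_ekf


soc_true, soc_cc, soc_ekf = run()
print(f"true={soc_true:.3f}  coulomb-only={soc_cc:.3f}  EKF={soc_ekf:.3f}")
```

The coulomb-only estimate carries its 30% initial error unchanged to the end of the run, while the filtered estimate converges onto the true value within the first few updates. With a real cell, sensor noise and unmodelled overpotential would slow and blur that convergence.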
Protection Algorithms: From Estimate to Action
Protection logic translates continuous sensor measurements and state estimates into discrete actions (derate, warn, disconnect). The architecture requires careful threshold design:
| Trigger | Level 1 (Warning) | Level 2 (Derate) | Level 3 (Disconnect) |
|---|---|---|---|
| Cell overvoltage (NMC) | >4.18 V | >4.20 V | >4.25 V |
| Cell undervoltage (NMC) | <3.1 V | <3.0 V | <2.9 V |
| Cell overvoltage (LFP) | >3.62 V | >3.65 V | >3.70 V |
| Cell undervoltage (LFP) | <2.6 V | <2.5 V | <2.4 V |
| Cell overtemperature (charging) | >42°C | >47°C | >50°C |
| Cell overtemperature (discharging) | >55°C | >58°C | >62°C |
| Cell undertemperature (charging) | <5°C | <2°C | <0°C |
| Current (discharge) | >2C | >3C | >5C |
Hysteresis in the thresholds prevents oscillation: once a fault is triggered at Level 2, it does not clear until the parameter has returned well inside the Level 1 threshold (well below it for an upper limit, well above it for a lower limit). Timers prevent nuisance trips from brief transients: an overvoltage spike during regenerative braking that lasts 50 ms should not trigger a hard disconnect.

    /* BMS protection thresholds -- NMC 21700 pack.
       Two-level (warn / fault) simplification of the tiered scheme. */
    #include <stdint.h>

    #define CELL_OV_WARN_MV    4180U  /* Level-1: reduce charge current */
    #define CELL_OV_FAULT_MV   4200U  /* Level-2: stop charging */
    #define CELL_UV_WARN_MV    3000U  /* Level-1: derate discharge */
    #define CELL_UV_FAULT_MV   2800U  /* Level-2: disconnect load */
    #define CELL_OT_CHARGE_DC  450    /* deg-C x10: max charge temp 45C */
    #define CELL_OT_DISCHG_DC  600    /* deg-C x10: max dischg temp 60C */

    /* Fault bitmask positions */
    #define FAULT_OV  0x01U  /* bit0: overvoltage */
    #define FAULT_UV  0x02U  /* bit1: undervoltage */
    #define FAULT_OT  0x04U  /* bit2: overtemperature */

    typedef struct {
        uint16_t cell_mv;
        int16_t  temp_dc;      /* temperature in 0.1 C units */
        uint8_t  fault_flags;  /* bitmask: bit0=OV, bit1=UV, bit2=OT */
    } CellState_t;

    /* Return the fault bitmask for one cell. Only the fault-level limits are
       checked, and overtemperature uses the discharge limit. */
    uint8_t bms_check_cell(const CellState_t *c) {
        uint8_t f = 0;
        if (c->cell_mv >= CELL_OV_FAULT_MV)  f |= FAULT_OV;
        if (c->cell_mv <= CELL_UV_FAULT_MV)  f |= FAULT_UV;
        if (c->temp_dc >= CELL_OT_DISCHG_DC) f |= FAULT_OT;
        return f;
    }

Each parameter (voltage, temperature, current) has three thresholds (warning, derate, and hard disconnect) with hysteresis between them. A nuisance trip from a brief regenerative braking voltage spike is prevented by time delays (50–100 ms) at the warning and derate levels; only a sustained exceedance triggers the hard disconnect. Hysteresis prevents oscillation: once a Level 2 fault is triggered at 4.20 V, the fault does not clear until voltage falls well below the Level 1 threshold of 4.18 V. This architecture allows fast response to genuine faults without false trips from transient load spikes.
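The interaction of tiers, time delays, and hysteresis is easier to follow as a small state machine. The sketch below handles NMC cell overvoltage only, using the three thresholds from the table. The 100 ms debounce applied to every level and the 30 mV clearing margin are assumptions (the text gives 50–100 ms and "well below" without a figure), and `OvervoltageMonitor` is a hypothetical class name.

```python
class OvervoltageMonitor:
    """Tiered cell-overvoltage monitor with debounce timers and hysteresis."""

    LEVELS_V = (4.18, 4.20, 4.25)  # warning, derate, disconnect (NMC row of the table)

    def __init__(self, debounce_ms: int = 100, clear_margin_V: float = 0.03):
        self.debounce_ms = debounce_ms                          # assumed value
        self.clear_below_V = self.LEVELS_V[0] - clear_margin_V  # assumed margin
        self.level = 0              # 0 = normal, 1..3 = active protection level
        self._above_ms = [0, 0, 0]  # continuous time spent above each threshold

    def step(self, cell_V: float, dt_ms: int) -> int:
        for i, threshold in enumerate(self.LEVELS_V):
            # Count continuous time above the threshold; reset when it drops below
            self._above_ms[i] = self._above_ms[i] + dt_ms if cell_V > threshold else 0
            if self._above_ms[i] >= self.debounce_ms:
                self.level = max(self.level, i + 1)  # escalate, never step down here
        if cell_V < self.clear_below_V:
            self.level = 0  # clear only well below the Level 1 threshold
        return self.level


mon = OvervoltageMonitor()
spike = [mon.step(4.26, 10) for _ in range(5)][-1]       # 50 ms spike
mon.step(4.10, 10)
sustained = [mon.step(4.21, 10) for _ in range(20)][-1]  # 200 ms above 4.20 V
held = mon.step(4.17, 10)     # below Level 1 but above the clearing point
cleared = mon.step(4.10, 10)  # well below Level 1
print(spike, sustained, held, cleared)  # prints: 0 2 2 0
```

Time is counted in integer milliseconds to keep the debounce comparison exact. The 50 ms spike never reaches a protection level, the sustained excursion reaches Level 2, and the fault stays latched at 4.17 V until the voltage drops past the clearing point.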
Cell Balancing Algorithms
Passive balancing, which bleeds charge from the highest cells through a resistor, is simpler to implement than active balancing, but the algorithm still has to make three decisions:
When to balance: During charging only (most common — avoids wasted energy during driving), during charging and discharging (more effective but more complex), or continuously.
Threshold: Start balancing cell N when V_N > V_min_in_pack + 5–10 mV. This prevents trivial differences from triggering balancing constantly.
End condition: Stop balancing when all cells are within 3–5 mV of each other, or when the total pack is at the target charge level.
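These three decisions can be combined into one selection function. The sketch below assumes charging-only balancing, a 10 mV start threshold, and a 5 mV stop threshold applied per cell relative to the lowest cell; that per-cell stop rule is one possible reading of the end condition, not the only one, and `select_cells_to_bleed` is a hypothetical name.

```python
def select_cells_to_bleed(cell_mV: list, active: set, charging: bool,
                          start_mV: int = 10, stop_mV: int = 5) -> set:
    """Return the indices of cells whose bleed resistor should be switched on.

    A cell starts bleeding when it is more than start_mV above the lowest cell
    and keeps bleeding until it is within stop_mV of the lowest cell.
    """
    if not charging:
        return set()  # charging-only strategy: never bleed while driving
    v_min = min(cell_mV)
    selected = set()
    for i, v in enumerate(cell_mV):
        delta = v - v_min
        if delta > start_mV or (i in active and delta > stop_mV):
            selected.add(i)
    return selected


first = select_cells_to_bleed([4150, 4162, 4153, 4158], active=set(), charging=True)
second = select_cells_to_bleed([4150, 4157, 4153, 4158], active=first, charging=True)
third = select_cells_to_bleed([4150, 4154, 4153, 4155], active=second, charging=True)
print(first, second, third)
```

The gap between the start and stop thresholds acts as hysteresis: cell 1 keeps bleeding at 7 mV above the minimum because it is already active, while cell 3 at 8 mV is never switched on.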
With passive balancing at a 50 mA bleed current per cell, the balancing current is a small fraction of the charge current at 1C: about 1% for a 5 Ah cell and about 0.1% for a 50 Ah cell. For a large cell-to-cell imbalance (>5% SOC difference), passive balancing may require several full charge cycles to correct. This is why vehicles with passive BMS need to be charged to 100% occasionally: partial charging every day never gives the balancing algorithm enough time at the voltage plateau to correct accumulated imbalance.
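A rough estimate of the balancing time follows directly from the bleed current. The sketch below uses the 50 mA bleed current and 5% imbalance from the text; the 50 Ah cell capacity is an assumption carried over from the earlier pack example, and `balancing_hours` is a hypothetical helper.

```python
def balancing_hours(imbalance_soc_pct: float, Q_Ah: float, bleed_mA: float) -> float:
    """Hours of continuous bleeding needed to remove a given SOC imbalance."""
    excess_Ah = Q_Ah * imbalance_soc_pct / 100.0  # charge to be dissipated
    return excess_Ah / (bleed_mA / 1000.0)


hours = balancing_hours(imbalance_soc_pct=5.0, Q_Ah=50.0, bleed_mA=50.0)
print(f"{hours:.0f} h of bleeding")
```

Removing 2.5 Ah at 50 mA takes on the order of 50 hours of active balancing, far more than the time a single charge session spends near the top of charge.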
Fault Logging and Diagnostics
A production BMS maintains a fault log — typically stored in non-volatile memory — recording every protection event with a timestamp and the measured value that triggered it. This log is essential for:
- Warranty claims: Proving or disproving that a cell failure was caused by user abuse (overcharge, overdischarge) vs manufacturing defect
- Fleet management: Early warning of cells or packs that are trending toward their protection thresholds before hard faults occur
- Regulatory compliance: AIS-156 requires fault documentation and data retention for incident investigation
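As an illustration of what such a log might hold, the sketch below defines a record with a timestamp, fault code, level, cell index, and triggering value, stored in a fixed-capacity buffer. Every name and field here is hypothetical, and the buffer lives in RAM; a real BMS would write the records to non-volatile memory.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class FaultRecord:
    timestamp_s: int  # seconds since an epoch (clock source is an assumption)
    code: str         # e.g. "CELL_OV"
    level: int        # 1 = warning, 2 = derate, 3 = disconnect
    cell_index: int
    value: float      # measured value that triggered the event


class FaultLog:
    """Fixed-capacity log: the oldest record is dropped when the log is full."""

    def __init__(self, capacity: int = 256):
        self._records = deque(maxlen=capacity)

    def record(self, rec: FaultRecord) -> None:
        self._records.append(rec)

    def count_by_code(self) -> dict:
        """Number of stored events per fault code, for trend analysis."""
        counts = {}
        for r in self._records:
            counts[r.code] = counts.get(r.code, 0) + 1
        return counts


log = FaultLog(capacity=2)
log.record(FaultRecord(1000, "CELL_OV", 1, 7, 4.185))
log.record(FaultRecord(1060, "CELL_OV", 2, 7, 4.203))
log.record(FaultRecord(2000, "CELL_OT", 1, 3, 43.0))
print(log.count_by_code())
```

Counting events per code and per cell is the kind of simple aggregation that turns a raw log into the fleet-level early warning described above.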
The fault log is one of the most practically valuable features of a BMS that is often underutilised in Indian fleet deployments. Most BMS units in Indian commercial EVs have comprehensive fault logging capability; most fleet operators never retrieve or analyse the data.
Key Takeaways
- SOC estimation combines three algorithms: coulomb counting (dynamic tracking), OCV lookup (periodic anchoring), and Kalman filter (optimal fusion). Each fails in different conditions, and in combination each covers the others' failure modes.
- LFP's flat OCV curve makes OCV-based correction nearly useless in the 20–90% SOC range. LFP BMS must rely on coulomb counting as primary, with OCV correction only at rest states near the endpoints.
- Protection logic uses tiered thresholds with hysteresis and time delays to avoid nuisance trips while maintaining fast response to genuine fault conditions. The specific voltage thresholds differ between NMC and LFP by chemistry-specific limits.
- Passive balancing at 50 mA requires many full charge cycles to correct large cell imbalances. Occasional full charges to 100% are operationally necessary for packs with passive balancing, not just for BMS calibration.
- BMS fault logging is a powerful diagnostic tool that is systematically underutilised in Indian commercial EV fleets. Fleet operators who retrieve and analyse BMS fault data regularly catch degrading cells and protection events before they cause vehicle downtime.