ES score normalization
## PR: Normalize and Standardize Elasticsearch Search Scores
### Problem / What Was Missing
- **Inconsistent Scoring:**
  Elasticsearch’s raw `_score` is not normalized and varies widely between queries and indices, even for similar queries. This caused:
  - Difficulty comparing scores across different queries or result sets.
  - Unstable thresholds for “high confidence” or “best match.”
  - Confusing or misleading confidence displays for users.
- **Downstream Usage Issues:**
  - Confidence buckets and thresholds were based on the raw `_score`, which is not an absolute measure.
  - The API and UI only exposed the raw score, not a normalized or percentage-based value.
---
### What Was Implemented
#### 1. **Score Normalization**
- **Min-Max Normalization:**
For each search result set, the code now computes the minimum and maximum `_score` values. Each result’s score is normalized to a 0–1 range:
```
normalized_score = (raw_score - min_score) / (max_score - min_score)
```
- Handles the edge case where all scores are the same (`max_score == min_score`), which would otherwise cause a division by zero.
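
The normalization step can be sketched roughly as follows, assuming hits arrive as a list of dicts carrying a `_score` key, as in an Elasticsearch response. The helper name `normalize_scores` and the choice to map an all-equal result set to `1.0` are illustrative assumptions, not necessarily what the code in this PR does.

```python
from typing import Any


def normalize_scores(hits: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return copies of the hits with a min-max normalized score attached.

    Hypothetical helper: the function name and the all-equal fallback
    value (1.0) are assumptions made for illustration.
    """
    if not hits:
        return []

    raw_scores = [hit["_score"] for hit in hits]
    min_score, max_score = min(raw_scores), max(raw_scores)
    score_range = max_score - min_score

    normalized = []
    for hit in hits:
        if score_range == 0:
            # All scores are identical (this includes a single hit), so the
            # formula would divide by zero. Assumed fallback: treat every
            # hit as a top match.
            value = 1.0
        else:
            value = (hit["_score"] - min_score) / score_range
        # Copy the hit rather than mutating the caller's data.
        normalized.append({**hit, "search_normalized_score": value})
    return normalized
```

Other fallbacks for the all-equal case (for example `0.0` or `0.5`) are equally defensible; what matters is that the choice is explicit and documented next to the thresholds that depend on it.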
#### 2. **API/Serializer Enhancements**
- **Expose Both Scores:**
The API now returns both the raw `_score` and the normalized score (`search_score` and `search_normalized_score`).
- **Confidence Calculation:**
The `search_confidence` field is now based on the normalized score, providing a consistent percentage (e.g., “87.5%”) regardless of the raw score range.
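
As a rough illustration of the confidence calculation, the snippet below turns a normalized score into a percentage string. The function name `format_confidence`, the one-decimal precision (inferred from the “87.5%” example), and the defensive clamping are assumptions rather than details confirmed by this PR.

```python
def format_confidence(normalized_score: float) -> str:
    """Render a 0-1 normalized score as a percentage string.

    Hypothetical helper: name, precision, and clamping are assumptions.
    """
    # Clamp defensively in case floating-point drift pushes the value
    # slightly outside the [0, 1] range.
    clamped = max(0.0, min(1.0, normalized_score))
    return f"{clamped * 100:.1f}%"
```

Under these assumptions, `format_confidence(0.875)` returns `"87.5%"`.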
#### 3. **Downstream Logic Updates**
- **Thresholds and Buckets:**
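All logic for “high confidence,” “very high match,” and bucketing now uses the normalized score, so thresholds no longer depend on the raw score range (e.g., 0.8 always means a score at least 80% of the way from the lowest to the highest score in the result set).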
- **Legacy Fallback:**
If a normalized score is not available, the code falls back to the old raw score logic.
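
The threshold check with its legacy fallback might look something like the sketch below. The function name and both cutoff values are hypothetical; this description does not list the real thresholds.

```python
from typing import Optional

# Hypothetical cutoffs, chosen only for illustration.
HIGH_CONFIDENCE_THRESHOLD = 0.8   # applied to the normalized 0-1 score
LEGACY_RAW_THRESHOLD = 10.0       # applied to the raw Elasticsearch _score


def is_high_confidence(
    raw_score: float, normalized_score: Optional[float] = None
) -> bool:
    """Classify a result, preferring the normalized score when present."""
    if normalized_score is not None:
        return normalized_score >= HIGH_CONFIDENCE_THRESHOLD
    # Legacy fallback: no normalized score available, so use the raw score.
    return raw_score >= LEGACY_RAW_THRESHOLD
```

Checking `is not None` rather than truthiness matters here, since a normalized score of `0.0` is a valid value and should not trigger the fallback.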
#### 4. **Documentation in Code**
- **Comments and Structure:**
The code is now clear about which score is being used and why, making it easier for future maintainers to understand the normalization process.
---
### Why This Makes Scores More Consistent
- **Stable Range:**
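All scores are now in a 0–1 range, so thresholds and confidence levels are expressed on the same scale for every query. Because normalization is applied per result set, a score describes where a result sits between the weakest and strongest hit for that query.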
- **User-Friendly Confidence:**
Users and downstream consumers can interpret confidence as a percentage, not an arbitrary number.
- **Easier Tuning:**
Product and engineering teams can set thresholds (e.g., “show only results with confidence > 70%”) without worrying about the quirks of Elasticsearch’s raw scoring.
- **Future-Proof:**
If the underlying Elasticsearch configuration changes and raw scores shift, the normalized 0–1 range exposed by the API and UI stays the same.
---
### Summary Table
| Field                     | Before (raw)       | After (normalized)        |
|---------------------------|--------------------|---------------------------|
| `search_score`            | Raw float          | Raw float (unchanged)     |
| `search_normalized_score` | N/A                | Float in 0–1              |
| `search_confidence`       | % of max raw score | Normalized score as %     |
| Thresholds/Buckets        | Based on raw score | Based on normalized score |
---
**In summary:**
This PR makes search scoring more robust, interpretable, and consistent for all users and downstream systems.