AI-Supported Import: End-to-End Process Overview

Overview

With the AI-supported import of consumption data, you can upload your existing emission activity data from MS Excel and have Lucanet's AI automatically convert each row into a structured GHG footprint suggestion — complete with the correct emission factor, scope classification, and CO₂ equivalent values (see also Filling Out the Emissions Module).

The following sections provide an overview of the process, how the AI determines certain values to be included in the footprint suggestions, and an explanation of the AI confidence levels displayed.

Process Overview

You upload one MS Excel file at a time. The AI processes all relevant sheets in the uploaded file, reading each sheet independently and skipping rows that represent totals or aggregated summaries. It then presents you with a set of suggested footprints for review. You review, adjust where necessary, and accept the footprint suggestions from the current file before you can upload the next one. This ensures data quality and gives you full control over every footprint before it is included in your GHG dataset.

Traceability

Every suggested footprint is linked back to its exact origin in your MS Excel file: you can see both the sheet name and the original row number. This makes it straightforward to cross-reference the AI's output with your source data, verify the numbers, and trace any footprint back to where it came from.
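
The traceability idea can be illustrated with a minimal sketch. It is not Lucanet's implementation: the `SourceRow` type, the `collect_rows` function, and the label-based detection of total rows are assumptions made for illustration only, and the workbook is modeled as plain Python lists rather than a real Excel file.

```python
from dataclasses import dataclass

# Hypothetical labels that mark a total row; the real detection logic is not documented.
TOTAL_LABELS = {"total", "subtotal", "grand total", "sum"}


@dataclass
class SourceRow:
    sheet: str       # name of the Excel sheet the row came from
    row_number: int  # 1-based row number as displayed in Excel
    cells: list      # raw cell values of the row


def collect_rows(workbook: dict[str, list[list]]) -> list[SourceRow]:
    """Read each sheet independently and remember where every row came from."""
    rows = []
    for sheet_name, sheet_rows in workbook.items():
        # Assumption: row 1 holds the column headers, so data starts at row 2.
        for row_number, cells in enumerate(sheet_rows[1:], start=2):
            label = str(cells[0]).strip().lower() if cells else ""
            if label in TOTAL_LABELS:
                continue  # skip totals and aggregated summaries
            rows.append(SourceRow(sheet_name, row_number, cells))
    return rows


workbook = {
    "Fleet": [
        ["Activity", "Quantity", "Unit"],
        ["Diesel", 1200, "l"],
        ["Petrol", 300, "l"],
        ["Total", 1500, "l"],
    ],
    "Energy": [
        ["Activity", "Quantity", "Unit"],
        ["Electricity", 45000, "kWh"],
    ],
}

for row in collect_rows(workbook):
    print(row.sheet, row.row_number, row.cells[0])
```

Because the sheet name and row number travel with each row, any later suggestion derived from that row can point back to its source cell range.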

Individual Steps of the AI Process

When you upload your Excel file, the AI works through a structured, multi-step process to transform each activity into a footprint suggestion.

Retrieve valid unit types
Before doing anything else, the AI fetches the complete list of valid unit types from the emission factor database (e.g., Volume, Distance, Weight, Money, Energy). This is a prerequisite step that ensures every subsequent search uses only valid, recognized measurement categories.

Read and interpret the file content
The AI reads each row across all relevant Excel sheets. For each activity it identifies:

  • The activity type (e.g., fuel combustion, electricity, freight, business travel)
  • The geographic location (country or region)
  • The unit of measurement (liters, km, kWh, EUR, etc.)
  • The quantity, interpreted in the locale-specific number format of the source data
  • The reporting year — detected from the file content or sheet context
  • The original Excel row number, which is appended to every row in the processed data for full traceability


The AI also automatically detects the reporting year and the default region from the file. Both values serve as a reference for all subsequent emission factor searches: the reporting year determines which factor vintages are valid, and the detected region anchors the geographic search.
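
To see why the quantity has to be interpreted in the locale-specific number format of the source data, consider the string "1.234": it means 1234 in a German-formatted file and 1.234 in an English-formatted one. The following is a minimal sketch of such parsing; the function name `parse_quantity` and its parameters are hypothetical and do not describe how the AI parses values internally.

```python
def parse_quantity(text: str, decimal_sep: str = ",", group_sep: str = ".") -> float:
    """Convert a locale-formatted number string to a float.

    Defaults follow the German convention (1.234,56); pass
    decimal_sep="." and group_sep="," for the English convention.
    """
    cleaned = text.strip().replace(group_sep, "").replace(decimal_sep, ".")
    return float(cleaned)


# The same characters yield different quantities depending on the locale.
german = parse_quantity("1.234")                                   # 1234.0
english = parse_quantity("1.234", decimal_sep=".", group_sep=",")  # 1.234
print(german, english)

# A full value with both separators, read in each convention.
print(parse_quantity("1.234,56"))                                  # 1234.56
print(parse_quantity("1,234.56", decimal_sep=".", group_sep=","))  # 1234.56
```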

Search for emission factors using a 4-phase strategy
For each activity, the AI conducts a structured search in the emission factor databases we provide and in your custom emission factors:

  • Phase 1 — Discovery: A broad search with minimal constraints (activity keywords and unit type, no region filter) to understand what data is available and learn the naming patterns, available regions, and data sources in the database.
  • Phase 2 — Region grouping: Based on the detected country, the AI builds geographic tiers — for example, for a German activity: exact Germany → German-speaking cluster (DE, AT, CH) → Central Europe → EU-wide → Global.
  • Phase 3 — Multi-region search: The AI searches across a geographic cluster in a single query, which is more efficient and automatically provides built-in geographic fallbacks.
  • Phase 4 — Progressive refinement: If results are insufficient, the AI iterates through broader tiers until suitable candidates are found. If too many results are returned (more than 200), the AI adds more specific keywords or switches to a single-region search to reduce ambiguity.
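
The tiered fallback in Phases 2 to 4 can be sketched as a loop over progressively broader region groups. This is an illustration under stated assumptions, not the actual search logic: the region codes, the `find_candidates` function, and the `fake_search` stand-in for the emission factor database are all hypothetical. Only the tier order and the threshold of 200 results come from the description above.

```python
MAX_RESULTS = 200  # threshold named in Phase 4

# Tiers for a German activity, following the Phase 2 example.
# The region codes themselves are hypothetical.
GERMANY_TIERS = [
    ["DE"],
    ["DE", "AT", "CH"],
    ["CENTRAL_EUROPE"],
    ["EU"],
    ["GLOBAL"],
]


def find_candidates(search, keywords, tiers, extra_keywords=()):
    """Query one tier at a time, broadening until something is found."""
    for regions in tiers:
        results = search(keywords, regions)  # one query covers the whole cluster
        if not results:
            continue  # insufficient: move on to the next, broader tier
        if len(results) > MAX_RESULTS and extra_keywords:
            # Too many hits: retry with more specific keywords.
            narrowed = search(list(keywords) + list(extra_keywords), regions)
            if narrowed:
                results = narrowed
        return results
    return []


# Tiny in-memory stand-in for the emission factor database.
FACTORS = [
    {"name": "Diesel combustion", "region": "AT"},
    {"name": "Diesel combustion", "region": "GLOBAL"},
]


def fake_search(keywords, regions):
    return [
        f for f in FACTORS
        if f["region"] in regions
        and all(k.lower() in f["name"].lower() for k in keywords)
    ]


hits = find_candidates(fake_search, ["diesel"], GERMANY_TIERS)
print(hits)  # the Austrian factor, found in the second tier
```

In this example no German factor exists, so the first tier returns nothing and the German-speaking cluster supplies the Austrian factor before the global one is ever considered.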

Select the best-matching emission factor
When multiple potential factors are found, the AI applies a clear priority order:

  1. Correct unit type — Must match the activity's measurement category
  2. Activity match — The emission factor description must align with the activity
  3. Geographic match — Exact country preferred, then geographic cluster, then continent, then global
  4. Most recent year — The most up-to-date factor is preferred, but the factor year must not be newer than the reporting year (future data is methodologically invalid for historical reporting)
  5. Specific unit — The exact unit (e.g., liters vs. cubic meters) must match
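
One way to read this priority order is as a set of hard filters followed by a ranking. The sketch below assumes that unit type, activity match, specific unit, and the "not newer than the reporting year" rule act as filters, while geography and recency rank the remaining candidates. That split, the `Factor` type, and the `select_factor` function are interpretations for illustration, not the documented implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Factor:
    name: str
    unit_type: str        # e.g., "Volume"
    unit: str             # e.g., "l"
    region_rank: int      # 0 = exact country, 1 = cluster, 2 = continent, 3 = global
    year: int
    activity_match: bool  # True if the description aligns with the activity


def select_factor(candidates: list[Factor], unit_type: str, unit: str,
                  reporting_year: int) -> Optional[Factor]:
    """Filter out invalid candidates, then prefer closer regions and newer years."""
    eligible = [
        f for f in candidates
        if f.unit_type == unit_type
        and f.activity_match
        and f.unit == unit
        and f.year <= reporting_year  # future factors are invalid
    ]
    if not eligible:
        return None
    # Lower region rank wins first; among equals, the most recent year wins.
    return min(eligible, key=lambda f: (f.region_rank, -f.year))


candidates = [
    Factor("Diesel, DE, 2021", "Volume", "l", 0, 2021, True),
    Factor("Diesel, DE, 2024", "Volume", "l", 0, 2024, True),     # newer than 2023
    Factor("Diesel, global, 2023", "Volume", "l", 3, 2023, True),
    Factor("Diesel, DE, 2022", "Energy", "kWh", 0, 2022, True),   # wrong unit type
]

best = select_factor(candidates, "Volume", "l", reporting_year=2023)
print(best.name)  # Diesel, DE, 2021
```

Note that the German 2021 factor beats the more recent global 2023 factor, because geographic match ranks above recency.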

Map and structure the footprint
The selected emission factor is mapped to:

  • The correct GHG scope (Scope 1, 2, or 3)
  • The appropriate category (e.g., Mobile Combustion, Business Travel, Electricity, Purchase of Goods and Services)
  • The correct parameter structure — including handling of compound unit types (e.g., weight over distance for freight, passengers over distance for flights). For compound types where both values must be present and cannot be assumed, the AI flags the row as incomplete rather than inventing a missing value.
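
The handling of compound unit types can be pictured as a completeness check. In this sketch the unit type names, the `COMPOUND_PARAMETERS` mapping, and the `missing_parameters` function are hypothetical. The only behavior taken from the text is that a row with a missing value is flagged as incomplete instead of being filled with a guess.

```python
from typing import Optional

# Hypothetical mapping from compound unit type to the parameters it requires.
COMPOUND_PARAMETERS = {
    "WeightOverDistance": ("weight", "distance"),          # freight
    "PassengersOverDistance": ("passengers", "distance"),  # flights
}


def missing_parameters(unit_type: str, values: dict[str, Optional[float]]) -> list[str]:
    """Return the required parameters that are absent; never invent a value."""
    required = COMPOUND_PARAMETERS.get(unit_type, ())
    return [name for name in required if values.get(name) is None]


# A freight row that states the weight but not the distance.
missing = missing_parameters("WeightOverDistance", {"weight": 12.5, "distance": None})
status = "incomplete" if missing else "complete"
print(status, missing)  # incomplete ['distance']
```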

AI Confidence Levels

For every suggested footprint, the AI calculates a confidence score that indicates how certain it is about the suggestion. This helps you focus your review on the entries that need the most attention. The overall confidence score is calculated from four weighted factors:


Factor

Description


Semantic match quality

  • Weighted with 40%
  • Measures how closely your source text matches the emission factor's activity description:
    • 90–100% — Exact keyword match (e.g., diesel maps directly to a diesel emission factor)
    • 70–85% — Synonym or related term match (e.g., fuel oil mapped to diesel)
    • 50–70% — Category-level match only (e.g., vehicle fuel mapped to diesel)
    • 30–50% — Inferred from context (e.g., fleet costs used to infer diesel)
    • 0–30% — No clear match; a fallback factor was used

Region match

  • Weighted with 30%
  • Measures how precisely the emission factor's geography matches your data's location
    • 100% — Exact country match (e.g., a German office matched to a German emission factor)
    • 80–90% — Country within a broader region (e.g., Germany matched to a DACH factor), or no region detected but your default region was used
    • 70–80% — Continent-level match (e.g., Germany matched to an EU/Europe factor)
    • 50–70% — No regional overlap; a global factor was used as the best available match
    • 20–40% — No region detected at all; a global average was applied
    • 0–20% — Geographic mismatch (e.g., a German activity matched to an Australian factor)

Year match

  • Weighted with 15%
  • Measures how current the emission factor data is relative to your reporting year
    • 100% — The emission factor year matches your reporting year exactly
    • 80–90% — The emission factor is 1–5 years older than your reporting year
    • 60–80% — The emission factor is 6–20 years older
    • 0% — The emission factor is newer than your reporting year (methodologically invalid; future data cannot be used for historical reporting)

Emission factor specificity

  • Weighted with 15%
  • Measures how many potential factors the AI had to choose from (fewer = more confident)
    • 100% — Only 1 matching factor found (unambiguous)
    • 70% — 2–10 matching factors (reasonable disambiguation)
    • 50% — More than 10 matching factors (high ambiguity)

Formula: confidence = (semantic × 0.40) + (region × 0.30) + (year × 0.15) + (specificity × 0.15)
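
As a worked example of the formula, the following sketch combines four factor scores into an overall score. The weights and the specificity values come from the table above; the function names and the sample scores are made up for illustration.

```python
# Weights from the formula above.
WEIGHTS = {"semantic": 0.40, "region": 0.30, "year": 0.15, "specificity": 0.15}


def specificity_score(match_count: int) -> float:
    """Map the number of matching factors to the specificity score."""
    if match_count < 1:
        raise ValueError("at least one matching factor is required")
    if match_count == 1:
        return 100.0  # unambiguous
    if match_count <= 10:
        return 70.0   # reasonable disambiguation
    return 50.0       # high ambiguity


def confidence(scores: dict[str, float]) -> float:
    """Weighted sum of the four factor scores (each on a 0-100 scale)."""
    return sum(scores[name] * weight for name, weight in WEIGHTS.items())


# Sample row: exact keyword match, exact country, factor a few years old,
# four candidate factors to choose from.
example = {
    "semantic": 95.0,
    "region": 100.0,
    "year": 85.0,
    "specificity": specificity_score(4),  # 70.0
}

# 95 * 0.40 + 100 * 0.30 + 85 * 0.15 + 70 * 0.15 = 38 + 30 + 12.75 + 10.5
print(round(confidence(example), 2))  # 91.25
```

Because semantic match and region match together carry 70% of the weight, a weak keyword or geography match lowers the overall score far more than an older factor year does.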

Confidence Tiers

The combined score is displayed as a color-coded indicator for each suggested footprint:


Color

Description


Green

  • High confidence 
  • Score ≥ 80%

Yellow

  • Review recommended
  • Score = 50–79%

Red

  • Needs attention
  • Score < 50%
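
The tier thresholds translate directly into a small lookup. This sketch assumes the score is a percentage between 0 and 100 and that fractional scores between 79 and 80 fall into the Yellow tier; the function name `confidence_tier` is hypothetical, and the source does not state how such fractional values are rounded.

```python
def confidence_tier(score: float) -> str:
    """Return the indicator color for an overall confidence score (0-100)."""
    if score >= 80:
        return "Green"   # high confidence
    if score >= 50:
        return "Yellow"  # review recommended
    return "Red"         # needs attention


for score in (91.25, 64.0, 32.5):
    print(score, confidence_tier(score))
```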

Important Notes

  • For compound activity types (e.g., freight measured in weight over distance), the AI requires both values to be available in your source data. If data is incomplete, the AI will flag the row rather than invent a missing value.
  • Total (aggregated) rows in your Excel file are automatically excluded from processing.
  • Only one file can be processed at a time. The next upload only becomes available once the footprint suggestions from the current file have been reviewed and accepted.