Specification: Model Mapping Layer

Purpose

Defines how raw JSON or text retrieved from GitLab is converted into in-memory domain objects (group, project, issue, readme) that are then consumed by the Data Collector and downstream renderers.

1. General Mapping Rules

Ignore unknown fields present in source JSON; forward compatibility is preferred.
If a mandatory target attribute is missing in the source, raise a mapping error unless the attribute is explicitly documented below as optional.
Optional attributes absent in source become null / language-specific None.
Date-time strings ending in Z (UTC) or carrying an explicit offset must be parsed into a timezone-aware datetime object and normalized to UTC.
Error Handling: Missing data should be handled gracefully: log a warning and render a [!warning] callout. Incorrect data should be logged as an error and rendered as a [!danger] callout. Both rules apply to Markdown and Quarto output alike (see §5).
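The date-time rule above could be sketched as follows. This is a minimal illustration, not the prescribed implementation: the function name parse_utc is hypothetical, and rejecting timestamps without any offset is an assumption, since the rule only covers inputs with Z or an explicit offset.

```python
from datetime import datetime, timezone


def parse_utc(value: str) -> datetime:
    """Parse an ISO 8601 timestamp and normalize it to UTC (hypothetical helper)."""
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11 on,
    # so it is rewritten to an explicit offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Assumption: a timestamp without offset is treated as incorrect data.
        raise ValueError(f"timestamp has no timezone information: {value!r}")
    return parsed.astimezone(timezone.utc)
```

Both "2024-01-02T03:04:05Z" and "2024-01-02T05:04:05+02:00" map to the same UTC instant with this helper.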

2. Entity-Specific Details

2.1 Group

Source JSON Field	Target Attribute	Notes
`id` (string/int)	id (string)	Preserve as string for consistency.
`name`	name
others	ignored
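A minimal sketch of the Group mapping, assuming a dataclass as the target type. The names Group, MappingError and map_group are illustrative and not defined by this specification.

```python
from dataclasses import dataclass
from typing import Any, Dict


class MappingError(Exception):
    """Hypothetical error type raised when a mandatory field is missing."""


@dataclass(frozen=True)
class Group:
    id: str
    name: str


def map_group(raw: Dict[str, Any]) -> Group:
    # Both target attributes are mandatory, so a missing key is a mapping error.
    missing = [key for key in ("id", "name") if raw.get(key) is None]
    if missing:
        raise MappingError(f"group is missing mandatory field(s): {missing}")
    # The id may arrive as int or string; it is always stored as a string.
    # All other keys in raw are ignored for forward compatibility.
    return Group(id=str(raw["id"]), name=raw["name"])
```

Calling map_group({"id": 42, "name": "team", "web_url": "..."}) yields a Group with id "42", and the unknown web_url key is dropped silently.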

2.2 Project

Source Field	Target Attr.	Notes
`id`	id (int)
`name`	name
`path_with_namespace`	path_with_namespace	Optional
`default_branch`	default_branch	Optional
`last_activity_at`	last_activity_at (datetime)	Parse ISO timestamp to UTC (§1).
`namespace`	namespace (Group)	Optional. Parse as a Group object; set to None if absent or not parsable.
others	ignored
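The Project table could translate into code along these lines. This is a sketch under assumptions: the class and function names are hypothetical, and mandatory-field checks are left out for brevity, so a missing mandatory key surfaces as a plain KeyError here rather than as a mapping error.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Dict, Optional


@dataclass
class Group:
    id: str
    name: str


@dataclass
class Project:
    id: int
    name: str
    last_activity_at: datetime
    path_with_namespace: Optional[str] = None
    default_branch: Optional[str] = None
    namespace: Optional[Group] = None


def _namespace_or_none(raw: Any) -> Optional[Group]:
    # A namespace that is absent or lacks id/name maps to None instead of failing.
    if not isinstance(raw, dict) or raw.get("id") is None or raw.get("name") is None:
        return None
    return Group(id=str(raw["id"]), name=raw["name"])


def map_project(raw: Dict[str, Any]) -> Project:
    stamp = raw["last_activity_at"]
    if stamp.endswith("Z"):
        stamp = stamp[:-1] + "+00:00"  # needed before Python 3.11
    return Project(
        id=int(raw["id"]),
        name=raw["name"],
        last_activity_at=datetime.fromisoformat(stamp).astimezone(timezone.utc),
        # Optional attributes fall back to None when absent.
        path_with_namespace=raw.get("path_with_namespace"),
        default_branch=raw.get("default_branch"),
        namespace=_namespace_or_none(raw.get("namespace")),
    )
```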

2.3 Issue

Source Field	Target Attr.	Notes
`id`	id (int)	Must be interpretable as int, throw error if not possible.
`iid`	iid (int)	Must be interpretable as int, throw error if not possible.
`project_id`	project_id (int)	Must be interpretable as int, throw error if not possible.
`title`	title
`state`	state	Optional
`web_url`	web_url	Optional
`description`	description	Optional
others	ignored
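The integer requirement on id, iid and project_id might be enforced with a small helper like the one below. The names require_int, map_issue_ids and MappingError are assumptions for illustration, and accepting only ints and numeric strings is one possible reading of "interpretable as int".

```python
from typing import Any, Dict


class MappingError(Exception):
    """Hypothetical error type for values that cannot be mapped."""


def require_int(raw: Dict[str, Any], field: str) -> int:
    value = raw.get(field)
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(value, int) and not isinstance(value, bool):
        return value
    if isinstance(value, str):
        try:
            return int(value)
        except ValueError:
            pass  # fall through to the mapping error below
    raise MappingError(f"issue field {field!r} is not an integer: {value!r}")


def map_issue_ids(raw: Dict[str, Any]) -> Dict[str, int]:
    # All three identifiers are mandatory and must be integers.
    return {field: require_int(raw, field) for field in ("id", "iid", "project_id")}
```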

2.4 Readme

Source	Target Attr.	Notes
Raw string passed in	content	First paragraph (verbatim excerpt) - see §3.1
Raw string passed in	todo	TODO sections (verbatim excerpts) - see §3.2
Raw string passed in	full_content	Copy of the unprocessed string passed in.
Raw string passed in	extra	ReadmeExtract object containing all interpreted/extracted information from frontmatter.

3. Content Extraction Rules

3.1 First Paragraph Extraction

Definition: “First paragraph” is the first non-empty, non-heading block of text in a README. It also includes the headings that lead to this text: for each heading level, the most recent heading above the text, in order of increasing level; levels that do not occur are skipped. If the README is empty, “first paragraph” is an empty string.

Processing Steps:

Skip YAML frontmatter (lines between --- markers)
Find the first non-empty line that is not a heading (doesn’t start with #)
Include all heading lines that form a path to this text, starting from the highest level
Include the text paragraph itself
Preserve original formatting and line breaks

Examples:

# Project Title
## Section

### Subsection to be ignored, as it is not in the direct Path

### Subsection

This is the first paragraph.

Result: "# Project Title\n## Section\n\n### Subsection\n\nThis is the first paragraph."

# Project Title
This is the first paragraph.

Result: "# Project Title\nThis is the first paragraph."
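One way to implement the steps above is a single pass that tracks the current heading path. The sketch below is an illustration only: the function names are hypothetical, blank lines directly after a kept heading are retained (which is what the first example implies), and lines starting with # inside fenced code blocks are not treated specially.

```python
from typing import List, Tuple


def _strip_frontmatter(lines: List[str]) -> List[str]:
    # Frontmatter is only recognized when the very first line is a --- marker.
    if lines and lines[0].strip() == "---":
        for index in range(1, len(lines)):
            if lines[index].strip() == "---":
                return lines[index + 1:]
    return lines


def first_paragraph(readme: str) -> str:
    lines = _strip_frontmatter(readme.splitlines())
    path: List[Tuple[int, List[str]]] = []  # (level, heading line + blank lines after it)
    for index, line in enumerate(lines):
        if line.startswith("#"):
            level = len(line) - len(line.lstrip("#"))
            # A new heading closes every open heading of the same or a deeper level.
            path = [entry for entry in path if entry[0] < level]
            path.append((level, [line]))
        elif not line.strip():
            if path:
                path[-1][1].append(line)  # keep blank lines that follow a heading
        else:
            # First text line found: the paragraph runs until a blank line or heading.
            end = index
            while end < len(lines) and lines[end].strip() and not lines[end].startswith("#"):
                end += 1
            kept = [item for _, block in path for item in block] + lines[index:end]
            return "\n".join(kept)
    return ""  # empty README or headings only
```

Applied to the two examples above, this sketch reproduces the listed results.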

3.2 TODO Section Extraction

Definition: Extract verbatim sections containing TODO-related keywords as complete markdown blocks.

Processing Steps:

Use keywords from settings.todo_keywords (see Configuration Settings Layer)
Find sections whose heading contains one of these keywords (case-insensitive). A heading is a line that begins with at least one ‘#’; the number of ‘#’ characters is the heading level.
Extract complete sections including the heading and all content up to the next heading of equal or higher level (heading level as defined in the previous step)
Preserve original markdown formatting
If multiple sections match, concatenate them with double newlines

Examples:

## TODO
- [x] Setup project
- [ ] Add docs

## Other Section
Content here.

## Tasks
- [ ] More work

Result: "## TODO\n- [x] Setup project\n- [ ] Add docs\n\n## Tasks\n- [ ] More work"
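The extraction steps could be sketched as below. The function name extract_todo is hypothetical, keywords are matched as case-insensitive substrings of the heading line (an assumption), and the keyword values "todo" and "tasks" in the usage are chosen to match the example above rather than taken from settings.todo_keywords.

```python
from typing import Iterable, List, Optional


def extract_todo(readme: str, keywords: Iterable[str]) -> Optional[str]:
    lowered = [keyword.lower() for keyword in keywords]
    sections: List[List[str]] = []
    current: Optional[List[str]] = None  # lines of the section being collected
    current_level = 0
    for line in readme.splitlines():
        if line.startswith("#"):
            level = len(line) - len(line.lstrip("#"))
            # A heading of equal or higher level closes the open section.
            if current is not None and level <= current_level:
                sections.append(current)
                current = None
            if current is None and any(keyword in line.lower() for keyword in lowered):
                current, current_level = [line], level
                continue
        if current is not None:
            current.append(line)  # body lines and deeper headings stay verbatim
    if current is not None:
        sections.append(current)
    blocks = []
    for section in sections:
        while section and not section[-1].strip():
            section.pop()  # drop trailing blank lines before concatenation
        blocks.append("\n".join(section))
    # No matching section means the todo attribute stays None.
    return "\n\n".join(blocks) if blocks else None
```

With keywords ("todo", "tasks"), the example README above yields the listed result; a README without matching headings yields None.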

4. Derived Metadata Extraction

When a README contains YAML frontmatter, it is extracted into the extra field (a ReadmeExtract object, see §2.4). See spec for details.
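As a rough sketch of the frontmatter step, the block between the --- markers can be split off and handed to a YAML parser. The function names and the logger name are hypothetical, the parser is passed in as a callable so that no particular YAML library is assumed, and the fallback to an empty mapping follows the rule for invalid frontmatter in §5.2. How the result is turned into a ReadmeExtract object is not shown.

```python
import json
import logging
from typing import Any, Callable, Dict, Tuple

logger = logging.getLogger("model_mapping")  # hypothetical logger name


def split_frontmatter(readme: str) -> Tuple[str, str]:
    """Return (frontmatter text, remaining body); frontmatter is '' if absent."""
    lines = readme.splitlines()
    if lines and lines[0].strip() == "---":
        for index in range(1, len(lines)):
            if lines[index].strip() == "---":
                return "\n".join(lines[1:index]), "\n".join(lines[index + 1:])
    return "", readme


def load_frontmatter(readme: str, parse: Callable[[str], Any]) -> Dict[str, Any]:
    raw, _body = split_frontmatter(readme)
    if not raw:
        return {}
    try:
        data = parse(raw)
    except Exception as exc:  # parser-specific error types vary, so this is broad
        logger.error("invalid YAML frontmatter: %s", exc)
        return {}
    # Anything other than a mapping is treated as empty frontmatter.
    return data if isinstance(data, dict) else {}


# Usage with json.loads as a stand-in parser; a real YAML parser would be passed instead.
meta = load_frontmatter('---\n{"status": "active"}\n---\n# Title\n', json.loads)
```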

5. Error Handling

Principle: Maximum Effort. Continue after logging/replacing with defaults.

5.1 Missing Data (Graceful Handling)

All log entries mentioned in this section use the WARNING level.

Missing fields: Set to None or empty values, log message
Empty README: Set content to empty string, log message
No TODO sections: Set todo to None
No first paragraph: Set content to empty string, log message

5.2 Incorrect Data (Error Handling)

All log entries mentioned in this section use the ERROR level.

Invalid JSON: Log error, raise mapping error
Invalid YAML frontmatter: Log error, continue with empty frontmatter
Type coercion failures: Log error, preserve original value
Malformed content: Log error, extract what’s possible

5.3 Rendering Behavior

Missing data: Render [!warning] callout with descriptive message
Incorrect data: Render [!danger] callout with error details
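A callout renderer for these two cases might look like the sketch below. The blockquote form "> [!warning]" is an assumption about the intended callout syntax, and the function name render_callout is hypothetical; the actual output format is owned by the renderers.

```python
def render_callout(kind: str, message: str) -> str:
    """Render a [!warning] or [!danger] callout as a blockquote (assumed syntax)."""
    if kind not in ("warning", "danger"):
        raise ValueError(f"unsupported callout kind: {kind!r}")
    # Every message line is quoted so that multi-line details stay inside the callout.
    body = [f"> {line}" for line in message.splitlines()]
    return "\n".join([f"> [!{kind}]"] + body)
```

For example, render_callout("warning", "README is empty") gives a two-line block whose first line is "> [!warning]".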

6. Non-Goals

This layer performs no network I/O.
Sorting, grouping, rendering are handled elsewhere.
The table follows all rules in Table Rendering & UI (m-dash placeholder, custom labels …).
The consumer applies sorting/grouping according to their own needs (see Table Sorting).
Consumes sorted & grouped data from Table Rendering & UI.
Reuses placeholders and cell formatting rules defined there.
Reads metadata extracted by the Model Mapping and enriched by the Data Collector.