Specification: Model Mapping Layer

Purpose

Defines how raw JSON or text retrieved from GitLab is converted into in-memory domain objects (group, project, issue, readme) that are then consumed by the Data Collector and downstream renderers.


1. General Mapping Rules

  • Ignore unknown fields present in source JSON; forward compatibility is preferred.

  • Mandatory target attributes that are missing in source → raise mapping error unless explicitly documented below as optional.

  • Optional attributes absent in source become null / language-specific None.

  • Date-time strings ending in Z (UTC) or carrying an explicit offset must be parsed into a timezone-aware datetime object and converted to UTC.

  • Error Handling: Missing data is handled gracefully: log a warning and render a [!warning] callout. Incorrect data is logged as an error and rendered as a [!danger] callout. Both rules apply to Markdown and Quarto output (see §5).
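The date-time rule above can be sketched in Python as follows. This is a minimal illustration, not the layer's actual code: the function name parse_timestamp is hypothetical, and rejecting timestamps without any offset is an assumption, since the rule only covers strings ending in Z or carrying an explicit offset.

```python
from datetime import datetime, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 string into a timezone-aware datetime in UTC."""
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on,
    # so it is rewritten as an explicit UTC offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Assumption: a timestamp without Z or offset counts as incorrect data.
        raise ValueError(f"timestamp has no timezone information: {value!r}")
    return parsed.astimezone(timezone.utc)
```

Under this sketch, "2024-01-15T10:30:00.000Z" and "2024-01-15T12:30:00+02:00" map to the same UTC instant.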


2. Entity-Specific Details

2.1 Group

| Source JSON Field | Target Attribute | Notes |
|---|---|---|
| id (string/int) | id (string) | Preserve as string for consistency. |
| name | name | |
| others | ignored | |

2.2 Project

| Source Field | Target Attr. | Notes |
|---|---|---|
| id | id (int) | |
| name | name | |
| path_with_namespace | path_with_namespace | Optional |
| default_branch | default_branch | Optional |
| last_activity_at | last_activity_at (datetime) | Parse ISO timestamp to UTC (§1). |
| namespace | namespace (Group) | Try to parse as Group-object or None. |
| others | ignored | |
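A minimal sketch of the Project mapping, including the nested Group, might look like the following. The dataclass layout and the names map_project and map_namespace are assumptions made for illustration. Wrapping missing mandatory fields into a mapping error (§1) is left out to keep the sketch short, so a missing mandatory field surfaces here as a plain KeyError.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Mapping, Optional


@dataclass
class Group:
    id: str
    name: str


@dataclass
class Project:
    id: int
    name: str
    last_activity_at: datetime
    path_with_namespace: Optional[str] = None
    default_branch: Optional[str] = None
    namespace: Optional[Group] = None


def map_namespace(raw: Any) -> Optional[Group]:
    """Try to build a Group; fall back to None instead of failing."""
    if not isinstance(raw, dict) or "id" not in raw or "name" not in raw:
        return None
    # The Group id is preserved as a string (see 2.1).
    return Group(id=str(raw["id"]), name=raw["name"])


def map_project(source: Mapping[str, Any]) -> Project:
    """Map a decoded project JSON object; unknown fields are ignored."""
    stamp = source["last_activity_at"].replace("Z", "+00:00")
    return Project(
        id=int(source["id"]),
        name=source["name"],
        last_activity_at=datetime.fromisoformat(stamp).astimezone(timezone.utc),
        path_with_namespace=source.get("path_with_namespace"),
        default_branch=source.get("default_branch"),
        namespace=map_namespace(source.get("namespace")),
    )
```

Optional attributes are read with .get(), so their absence yields None without any special casing.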

2.3 Issue

| Source Field | Target Attr. | Notes |
|---|---|---|
| id | id (int) | Must be interpretable as int, throw error if not possible. |
| iid | iid (int) | Must be interpretable as int, throw error if not possible. |
| project_id | project_id (int) | Must be interpretable as int, throw error if not possible. |
| title | title | |
| state | state | Optional |
| web_url | web_url | Optional |
| description | description | Optional |
| others | ignored | |
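The integer rule for id, iid and project_id can be sketched as shown below. MappingError, Issue and map_issue are hypothetical names chosen for this example; the specification only requires that an error is raised, not how it is named.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


class MappingError(ValueError):
    """Hypothetical error type for values that cannot be mapped."""


@dataclass
class Issue:
    id: int
    iid: int
    project_id: int
    title: str
    state: Optional[str] = None
    web_url: Optional[str] = None
    description: Optional[str] = None


def _require_int(source: Mapping[str, Any], field: str) -> int:
    """Return source[field] as int or raise MappingError."""
    try:
        return int(source[field])
    except (KeyError, TypeError, ValueError) as exc:
        raise MappingError(f"Issue field '{field}' must be interpretable as int") from exc


def map_issue(source: Mapping[str, Any]) -> Issue:
    """Map a decoded issue JSON object; unknown fields are ignored."""
    if source.get("title") is None:
        raise MappingError("Issue is missing mandatory field 'title'")
    return Issue(
        id=_require_int(source, "id"),
        iid=_require_int(source, "iid"),
        project_id=_require_int(source, "project_id"),
        title=source["title"],
        state=source.get("state"),
        web_url=source.get("web_url"),
        description=source.get("description"),
    )
```

A numeric string such as "7" is accepted and converted, while a value such as "abc" or a missing id raises the error.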

2.4 Readme

| Source | Target Attr. | Notes |
|---|---|---|
| Raw string passed in | content | First paragraph (verbatim excerpt) - see §3.1 |
| Raw string passed in | todo | TODO sections (verbatim excerpts) - see §3.2 |
| Raw string passed in | full_content | Copy of the unprocessed string passed in. |
| Raw string passed in | extra | ReadmeExtract object containing all interpreted/extracted information from frontmatter. |


3. Content Extraction Rules

3.1 First Paragraph Extraction

Definition: The “first paragraph” is the first non-empty, non-heading block of text in a README. It also includes the heading path that leads to this text: one heading per level, in increasing level order, skipping levels that do not occur. Headings that are not on the direct path to the text are omitted. If the README is empty, the first paragraph is an empty string.

Processing Steps:

  1. Skip YAML frontmatter (lines between --- markers)

  2. Find the first non-empty line that is not a heading (doesn’t start with #)

  3. Include all heading lines that form a path to this text, starting from the highest level

  4. Include the text paragraph itself

  5. Preserve original formatting and line breaks

Examples:

# Project Title
## Section

### Subsection to be ignored, as it is not in the direct Path

### Subsection

This is the first paragraph.

Result: "# Project Title\n## Section\n\n### Subsection\n\nThis is the first paragraph."

# Project Title
This is the first paragraph.

Result: "# Project Title\nThis is the first paragraph."
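The processing steps and both examples above can be reproduced with a sketch along these lines. It is one possible reading of the rules rather than a reference implementation: the function names are hypothetical, and keeping the blank lines that follow a retained heading is an assumption derived from the first example's result string.

```python
from typing import List, Tuple


def heading_level(line: str) -> int:
    """Number of leading '#' characters; 0 means the line is not a heading."""
    return len(line) - len(line.lstrip("#"))


def extract_first_paragraph(readme: str) -> str:
    lines = readme.splitlines()
    # Step 1: skip YAML frontmatter delimited by '---' lines.
    if lines and lines[0].strip() == "---":
        for index in range(1, len(lines)):
            if lines[index].strip() == "---":
                lines = lines[index + 1:]
                break
    # Heading path: (level, heading line plus the blank lines that follow it).
    path: List[Tuple[int, List[str]]] = []
    for index, line in enumerate(lines):
        level = heading_level(line)
        if level:
            # A new heading replaces every heading of the same or deeper level.
            path = [entry for entry in path if entry[0] < level]
            path.append((level, [line]))
        elif not line.strip():
            if path:
                path[-1][1].append(line)
        else:
            # Steps 2 and 4: first text line found; collect the whole paragraph.
            paragraph: List[str] = []
            for text_line in lines[index:]:
                if not text_line.strip() or heading_level(text_line):
                    break
                paragraph.append(text_line)
            # Step 3: prepend the heading path, highest level first.
            kept = [item for _, chunk in path for item in chunk]
            return "\n".join(kept + paragraph)
    return ""  # empty README or headings only
```

The sketch treats every line starting with '#' as a heading, as step 2 does, so '#' comments inside fenced code blocks would be misread; handling that case is outside the rules stated here.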

3.2 TODO Section Extraction

Definition: Extract verbatim sections containing TODO-related keywords as complete markdown blocks.

Processing Steps:

  1. Use keywords from settings.todo_keywords (see Configuration Settings Layer)

  2. Find sections whose heading contains one of these keywords (case-insensitive). A heading is a line that begins with at least one ‘#’; the number of leading ‘#’ characters is the level of the heading.

  3. Extract complete sections, including the heading and all content up to the next heading of equal or higher level, that is, a heading with the same number of ‘#’ or fewer (as defined in step 2)

  4. Preserve original markdown formatting

  5. If multiple sections match, concatenate them with double newlines

Examples:

## TODO
- [x] Setup project
- [ ] Add docs

## Other Section
Content here.

## Tasks
- [ ] More work

Result: "## TODO\n- [x] Setup project\n- [ ] Add docs\n\n## Tasks\n- [ ] More work"
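A short sketch of these steps follows. The keyword list is passed in as a parameter because the contents of settings.todo_keywords are defined elsewhere; the function names are hypothetical. Returning None when nothing matches follows §5.1.

```python
from typing import List, Optional, Sequence


def heading_level(line: str) -> int:
    """Number of leading '#' characters; 0 means the line is not a heading."""
    return len(line) - len(line.lstrip("#"))


def extract_todo_sections(readme: str, keywords: Sequence[str]) -> Optional[str]:
    lines = readme.splitlines()
    lowered = [keyword.lower() for keyword in keywords]
    sections: List[str] = []
    index = 0
    while index < len(lines):
        level = heading_level(lines[index])
        if level and any(keyword in lines[index].lower() for keyword in lowered):
            # The section runs until the next heading with the same number
            # of '#' or fewer, or until the end of the document.
            end = index + 1
            while end < len(lines) and not (0 < heading_level(lines[end]) <= level):
                end += 1
            # Drop trailing blank lines so sections join with one empty line.
            sections.append("\n".join(lines[index:end]).rstrip("\n"))
            index = end
        else:
            index += 1
    return "\n\n".join(sections) if sections else None
```

Called with the example README above and the keywords "TODO" and "Tasks" (assumed values), this returns the result string shown. Because the scan resumes after the end of a matched section, a matching sub-heading inside an already extracted section is not extracted twice.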


4. Derived Metadata Extraction

When a README contains YAML front-matter, it is extracted to an extra field (ReadmeExtract object). See spec for details.
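Separating the frontmatter block from the README body can be sketched with the standard library alone, as below. The function name split_frontmatter is hypothetical, and parsing the YAML text into a ReadmeExtract object is deliberately not shown because its fields are defined elsewhere. Treating an unterminated block as "no frontmatter" is an assumption.

```python
from typing import Optional, Tuple


def split_frontmatter(readme: str) -> Tuple[Optional[str], str]:
    """Return (frontmatter text or None, remaining README body)."""
    lines = readme.splitlines()
    if not lines or lines[0].strip() != "---":
        return None, readme
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            frontmatter = "\n".join(lines[1:index])
            body = "\n".join(lines[index + 1:])
            return frontmatter, body
    # Assumption: an opening '---' without a closing marker is not frontmatter.
    return None, readme
```

The returned frontmatter string would then be handed to a YAML parser; invalid YAML falls under the error rules in §5.2.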


5. Error Handling

Principle: Maximum Effort. Log the problem, substitute a default where one is defined, and continue processing.

5.1 Missing Data (Graceful Handling)

All log entries in this subsection use the WARNING level.

  • Missing fields: Set to None or empty values, log message

  • Empty README: Set content to empty string, log message

  • No TODO sections: Set todo to None

  • No first paragraph: Set content to empty string, log message

5.2 Incorrect Data (Error Handling)

All log entries in this subsection use the ERROR level.

  • Invalid JSON: Log error, raise mapping error

  • Invalid YAML frontmatter: Log error, continue with empty frontmatter

  • Type coercion failures: Log error, preserve original value

  • Malformed content: Log error, extract what’s possible
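The two log levels can be illustrated with a pair of small helpers. The logger name and both function names are hypothetical. The second helper only illustrates the "log error, preserve original value" rule; fields for which §2 demands an error are not covered by it.

```python
import logging
from typing import Any, Mapping

logger = logging.getLogger("model_mapping")  # hypothetical logger name


def optional_or_none(source: Mapping[str, Any], field: str) -> Any:
    """Missing data (5.1): log at WARNING level and fall back to None."""
    if source.get(field) is None:
        logger.warning("Field %r is missing, using None", field)
        return None
    return source[field]


def coerce_int_or_keep(value: Any, field: str) -> Any:
    """Incorrect data (5.2): log at ERROR level and preserve the original value."""
    try:
        return int(value)
    except (TypeError, ValueError):
        logger.error("Type coercion to int failed for %r: %r", field, value)
        return value
```

Both helpers return a usable value in every case, so the caller can continue mapping the remaining fields.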

5.3 Rendering Behavior

  • Missing data: Render [!warning] callout with descriptive message

  • Incorrect data: Render [!danger] callout with error details
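As a rough illustration, a callout could be produced as follows. The exact output syntax is not fixed by this specification; the blockquote form with a [!warning] or [!danger] marker on the first line is an assumption, and render_callout is a hypothetical name. The actual rendering is done by the consuming renderer, not by this layer.

```python
def render_callout(kind: str, message: str) -> str:
    """Build a blockquote-style callout for missing or incorrect data."""
    if kind not in ("warning", "danger"):
        raise ValueError(f"unsupported callout kind: {kind!r}")
    # Every message line is quoted so that multi-line details stay inside the callout.
    body = "\n".join(f"> {line}" for line in message.splitlines())
    return f"> [!{kind}]\n{body}"
```

For example, render_callout("warning", "README is empty") yields a two-line block: the marker line followed by the quoted message.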


6. Non-Goals

  • This layer performs no network I/O.

  • Sorting, grouping, rendering are handled elsewhere.

  • The table follows all rules in Table Rendering & UI (m-dash placeholder, custom labels …).

  • The consumer applies sorting/grouping according to their own needs (see Table Sorting).

  • Consumes sorted & grouped data from Table Rendering & UI.

  • Reuses placeholders and cell formatting rules defined there.

  • Reads metadata extracted by the Model Mapping and enriched by the Data Collector.