---
title: Model Mapping Layer
---

# Specification: Model Mapping Layer

## Purpose

Defines how raw JSON or text retrieved from GitLab is converted into in-memory domain objects (group, project, issue, readme) that are then consumed by the [Data Collector](./spec_data_collector.md) and downstream renderers.

---

## 1. General Mapping Rules

* Ignore unknown fields present in source JSON; forward compatibility is preferred.
* Mandatory target attributes that are missing in source → raise mapping error unless explicitly documented below as optional.
* Optional attributes absent in source become `null` / language-specific `None`.
* Date-time strings ending in `Z` (UTC) or carrying an explicit offset must be parsed into timezone-aware datetime objects and converted to UTC.
* **Error Handling**: Missing data should be handled gracefully: log a warning and render a `[!warning]` callout. Incorrect data should be logged as an error and rendered as a `[!danger]` callout. Both apply when rendering in either [markdown](./spec_renderer_markdown.md) or [quarto](./spec_renderer_quarto.md).

---

## 2. Entity-Specific Details

### 2.1 Group

| Source JSON Field | Target Attribute | Notes |
|-------------------|------------------|-------|
| `id` *(string/int)* | id (string) | Preserve as string for consistency. |
| `name` | name | |
| *others* | *ignored* | |

### 2.2 Project

| Source Field | Target Attr. | Notes |
|--------------|-------------|-------|
| `id` | id (int) | |
| `name` | name | |
| `path_with_namespace` | path_with_namespace | Optional |
| `default_branch` | default_branch | Optional |
| `last_activity_at` | last_activity_at (datetime) | Parse ISO timestamp to UTC (§1). |
| `namespace` | namespace (Group) | Try to parse as a Group object; `None` if that fails. |
| *others* | *ignored* | |

### 2.3 Issue

| Source Field | Target Attr. | Notes |
|--------------|-------------|-------|
| `id` | id (int) | Must be interpretable as int; throw an error if not. |
| `iid` | iid (int) | Must be interpretable as int; throw an error if not. |
| `project_id` | project_id (int) | Must be interpretable as int; throw an error if not. |
| `title` | title | |
| `state` | state | Optional |
| `web_url` | web_url | Optional |
| `description` | description | Optional |
| *others* | *ignored* | |

### 2.4 Readme

| Source | Target Attr. | Notes |
|--------|-------------|-------|
| Raw string passed in | content | First paragraph (verbatim excerpt) - see §3.1 |
| Raw string passed in | todo | TODO sections (verbatim excerpts) - see §3.2 |
| Raw string passed in | full_content | Copy of the unprocessed string passed in. |
| Raw string passed in | extra | ReadmeExtract object containing all interpreted/extracted information from frontmatter. |

---

## 3. Content Extraction Rules

### 3.1 First Paragraph Extraction

**Definition**: The "first paragraph" is the first non-empty, non-heading block of text in a README. It also includes the heading path leading to that text: for each heading level, the last heading of that level before the text, skipping levels that do not occur. If the README is empty, the "first paragraph" is an empty string.

**Processing Steps**:

1. Skip YAML frontmatter (lines between `---` markers)
2. Find the first non-empty line that is not a heading (doesn't start with `#`)
3. Include all heading lines that form a path to this text, starting from the highest level
4. Include the text paragraph itself
5. Preserve original formatting and line breaks

**Examples**:

```markdown
# Project Title
## Section

### Subsection to be ignored, as it is not in the direct Path

### Subsection

This is the first paragraph.
```

**Result**: `"# Project Title\n## Section\n\n### Subsection\n\nThis is the first paragraph."`

```markdown
# Project Title
This is the first paragraph.
```

**Result**: `"# Project Title\nThis is the first paragraph."`

### 3.2 TODO Section Extraction

**Definition**: Extract verbatim sections containing TODO-related keywords as complete markdown blocks.

**Processing Steps**:

1. Use keywords from `settings.todo_keywords` (see [Configuration Settings Layer](./spec_settings.md))
2. Find sections whose heading contains one of these keywords (case-insensitive). A heading is a line that begins with at least one `#`; the number of `#` characters is the heading level.
3. Extract complete sections, including the heading and all content up to the next heading of equal or higher level (that is, with the same number of `#` or fewer, as defined in step 2)
4. Preserve original markdown formatting
5. If multiple sections match, concatenate them with double newlines

**Examples**:

```markdown
## TODO
- [x] Setup project
- [ ] Add docs

## Other Section
Content here.

## Tasks
- [ ] More work
```

**Result**: `"## TODO\n- [x] Setup project\n- [ ] Add docs\n\n## Tasks\n- [ ] More work"`

---

## 4. Derived Metadata Extraction

When a README contains YAML front-matter, it is extracted to an `extra` field (ReadmeExtract object). See [spec](./spec_readme_extraction.md) for details.

---

## 5. Error Handling

Principle: Maximum Effort. Continue after logging/replacing with defaults.

### 5.1 Missing Data (Graceful Handling)

All log entries mentioned here are written at WARNING level.

* **Missing fields**: Set to `None` or empty values, log message
* **Empty README**: Set content to empty string, log message
* **No TODO sections**: Set todo to `None`
* **No first paragraph**: Set content to empty string, log message

### 5.2 Incorrect Data (Error Handling)

All log entries mentioned here are written at ERROR level.

* **Invalid JSON**: Log error, raise mapping error
* **Invalid YAML frontmatter**: Log error, continue with empty frontmatter
* **Type coercion failures**: Log error, preserve original value
* **Malformed content**: Log error, extract what's possible

### 5.3 Rendering Behavior

* **Missing data**: Render `[!warning]` callout with descriptive message
* **Incorrect data**: Render `[!danger]` callout with error details

---

## 6. Non-Goals

* This layer performs **no** network I/O.
* Sorting, grouping, rendering are handled elsewhere.
* The table follows all rules in [Table Rendering & UI](./spec_table_rendering_ui.md) (m-dash placeholder, custom labels …).
* The consumer applies sorting/grouping according to their own needs (see [Table Sorting](./spec_table_sorting.md)).
* Consumes sorted & grouped data from [Table Rendering & UI](./spec_table_rendering_ui.md).
* Reuses placeholders and cell formatting rules defined there.
* Reads metadata extracted by the [Model Mapping](./spec_model_mapping.md) and enriched by the [Data Collector](./spec_data_collector.md).
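
As an illustration of the TODO section extraction rules in §3.2, the following is a minimal Python sketch, not the reference implementation. The function name `extract_todo_sections` and the choice to pass the keywords as a plain argument (rather than reading `settings.todo_keywords`) are assumptions made for this example. The sketch also assumes that trailing blank lines of each section are trimmed before concatenation, which is what the §3.2 example result suggests, and it does not treat `#` lines inside fenced code blocks specially.

```python
import re
from typing import List, Optional, Sequence

# A heading is a line starting with one or more '#'; the count is its level (§3.2, step 2).
HEADING_RE = re.compile(r"^(#+)\s*(.*)$")


def extract_todo_sections(text: str, keywords: Sequence[str]) -> Optional[str]:
    """Return all sections whose heading contains a keyword, or None if none match.

    Hypothetical helper: name and signature are illustrative only.
    """
    lowered = [k.lower() for k in keywords]
    sections: List[str] = []
    current: Optional[List[str]] = None  # lines of the section being collected
    current_level = 0

    for line in text.splitlines():
        match = HEADING_RE.match(line)
        if match:
            level = len(match.group(1))
            # A heading of equal or higher level closes the open section.
            if current is not None and level <= current_level:
                sections.append("\n".join(current).strip())
                current = None
            # Outside a section, a heading with a keyword opens a new one.
            if current is None and any(k in match.group(2).lower() for k in lowered):
                current = [line]
                current_level = level
                continue
        if current is not None:
            current.append(line)  # body lines and deeper-level headings

    if current is not None:
        sections.append("\n".join(current).strip())

    # §5.1: no TODO sections means the todo attribute is None.
    return "\n\n".join(sections) if sections else None


if __name__ == "__main__":
    readme = (
        "## TODO\n- [x] Setup project\n- [ ] Add docs\n\n"
        "## Other Section\nContent here.\n\n"
        "## Tasks\n- [ ] More work\n"
    )
    print(repr(extract_todo_sections(readme, ["todo", "tasks"])))
```

Called with the §3.2 example input and the keywords `todo` and `tasks`, this sketch reproduces the result string shown in that section. Deeper-level headings inside a matching section are kept as part of the section, since only a heading of equal or higher level ends it.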