Specification: ReadmeExtract Model
Purpose
Defines the ReadmeExtract model class, which holds information extracted and interpreted from README frontmatter, that is, data that is not a verbatim part of the original Markdown.
1. Class Definition
1.1 ReadmeExtract Model
from typing import Any

from pydantic import BaseModel, Field

class ReadmeExtract(BaseModel):
    """Extracted and interpreted information from README frontmatter."""
    # Frontmatter-derived fields
    authors: list[str] = Field(default_factory=list)
    supervisors: list[str] = Field(default_factory=list)
    # Raw frontmatter for reference
    raw_frontmatter: dict[str, Any] = Field(default_factory=dict)
2. Field Definitions
2.1 Frontmatter-Derived Fields
| Field | Type | Description | Source |
|---|---|---|---|
| authors | list[str] | All author names | |
| supervisors | list[str] | Author names with Supervision role | |
| raw_frontmatter | dict | Complete frontmatter for reference | All YAML frontmatter |
4. Construction Process
Parse frontmatter: Extract the YAML between the --- markers
Process frontmatter fields: Apply processing to get the extracted data
Construct object: Create ReadmeExtract with all extracted data
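The first step can be sketched with the standard library alone. This is a minimal illustration, not the project's implementation: the function name `split_frontmatter` is hypothetical, and it assumes that frontmatter is only recognized when the opening `---` is the very first line and that an unclosed block counts as no frontmatter. The returned YAML text would then be handed to a YAML parser (for example PyYAML's `yaml.safe_load`) before the processing step.

```python
from typing import Optional


def split_frontmatter(text: str) -> tuple[Optional[str], str]:
    """Return (frontmatter_yaml, body); frontmatter_yaml is None when absent.

    Hypothetical helper: the name and the exact rules are assumptions.
    """
    lines = text.splitlines()
    # Assumption: frontmatter must open on the first line with a bare '---'.
    if not lines or lines[0].strip() != "---":
        return None, text
    # Search for the closing '---' marker.
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            yaml_text = "\n".join(lines[1:i])
            body = "\n".join(lines[i + 1:])
            return yaml_text, body
    # Assumption: an opening marker without a closing one means no frontmatter.
    return None, text
```

When no frontmatter is found, the original text is returned unchanged as the body, which matches the "No Frontmatter" example in section 7.1.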
5. Integration with Readme Model
The Readme model should include:
class Readme(BaseModel):
    # Readme-Fields (see spec_model_mapping.md or models/readme.py)
    # [...]
    # extra-field with the Extract (this spec)
    extra: ReadmeExtract  # All interpreted/extracted information from frontmatter
6. Error Handling
Invalid YAML: Log error, continue with empty frontmatter
Missing fields: Set to None, "" or empty list as appropriate
Malformed author data: Log warning, skip invalid entries, continue with valid ones
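The rules for missing fields and malformed author data could look roughly like the sketch below. It works on an already-parsed frontmatter mapping and uses only the standard library. The function name `extract_people` is hypothetical; the `name` and `roles` keys are taken from the example in section 7.2; and the exact, case-sensitive match on the string `Supervision` is an assumption.

```python
import logging
from typing import Any

logger = logging.getLogger(__name__)


def extract_people(frontmatter: dict[str, Any]) -> tuple[list[str], list[str]]:
    """Return (authors, supervisors) from a parsed frontmatter mapping.

    Hypothetical helper: name and matching rules are assumptions.
    """
    authors: list[str] = []
    supervisors: list[str] = []

    # Missing field: fall back to an empty list.
    entries = frontmatter.get("authors") or []
    if not isinstance(entries, list):
        logger.warning("'authors' is not a list: %r", entries)
        return authors, supervisors

    for entry in entries:
        if isinstance(entry, str) and entry.strip():
            # Plain string entry: an author without roles.
            authors.append(entry.strip())
        elif (
            isinstance(entry, dict)
            and isinstance(entry.get("name"), str)
            and entry["name"].strip()
        ):
            name = entry["name"].strip()
            authors.append(name)
            roles = entry.get("roles") or []
            # Assumption: role names are compared exactly.
            if isinstance(roles, list) and "Supervision" in roles:
                supervisors.append(name)
        else:
            # Malformed entry: warn, skip it, keep the valid ones.
            logger.warning("Skipping malformed author entry: %r", entry)

    return authors, supervisors
```

The two returned lists map directly onto the `authors` and `supervisors` fields of ReadmeExtract; a single bad entry never discards the valid ones.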
7. Examples
7.1 Simple README (No Frontmatter)
# My Project
This is the first paragraph of my project.
Extract:
All fields: "" or empty lists, depending on type, or None if the type allows it.
raw_frontmatter: {}
7.2 README with Frontmatter
---
type: ML
priority: 5
authors:
  - name: Alice Example
    roles: [Supervision]
  - Bob Example
---
# Alpha One
## About
This is the main README for Alpha One.
Extract:
authors: ["Alice Example", "Bob Example"]
supervisors: ["Alice Example"]
raw_frontmatter: {"type": "ML", "priority": 5, "authors": [...]}
8. Non-Goals
Network I/O
File system operations
Rendering or formatting
Validation beyond basic type checking
Content extraction (first paragraph, TODO sections) - see spec_model_mapping.md