50 Commits

Author SHA1 Message Date
fac3d45817 refactor: remove templated sample file
• Deleted templated_sample.cpp
 • This file contained an example template usage and is no longer needed.
2025-12-05 11:07:28 -03:00
5964f16877 feat: Add reST docstring renderer support
• Introduce REST_DOCSTRING style in DocumentationStyle.
 • Implement ReSTDocstringRenderer for reStructuredText output.
 • Update renderers (Doxygen, Google, Numpy) to consistently use the examples field instead of example.
 • Add exceptions field handling to GoogleDocstringRenderer,
NumpyDocstringRenderer and DoxygenRenderer.
 • Update LLM constraints to refine example generation logic.
 • Map new style to the corresponding renderer.
2025-12-05 07:56:30 -03:00
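As a rough illustration of what a reStructuredText renderer produces, the sketch below turns a description, parameters, return text and exceptions into reST field-list lines. The function name `render_rest` and its arguments are hypothetical and are not taken from ReSTDocstringRenderer; only the `:param:`, `:returns:` and `:raises:` field names are standard Sphinx conventions.

```python
def render_rest(description, params=None, returns=None, raises=None):
    """Render a docstring body in reST field-list style (illustrative only)."""
    lines = [description.strip()]
    fields = []
    # One ":param name:" field per documented parameter.
    for name, text in (params or {}).items():
        fields.append(f":param {name}: {text}")
    if returns:
        fields.append(f":returns: {returns}")
    # One ":raises Exc:" field per documented exception.
    for exc, text in (raises or {}).items():
        fields.append(f":raises {exc}: {text}")
    if fields:
        # A blank line separates the summary from the field list.
        lines.append("")
        lines.extend(fields)
    return "\n".join(lines)


doc = render_rest(
    "Add two numbers.",
    params={"a": "First operand.", "b": "Second operand."},
    returns="The sum.",
    raises={"TypeError": "If an operand is not numeric."},
)
```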
a814502bbb fix: return a tuple from the abstract _get_node_to_document 2025-12-04 16:47:49 -03:00
64c77d7026 refactor: Improve docstring rendering consistency
• Add explicit docstring section headers in Google style.
 • Ensure blank lines separate sections in Google style.
 • Standardize entity kind retrieval using str().
 • Add support for rendering example in Numpy style.
 • Improve return type parsing in Numpy style.
2025-12-04 16:45:42 -03:00
f2d654ed76 fix: Change _is_node_namespace to return the node or None instead of a boolean. 2025-12-04 16:42:58 -03:00
f6a5c606e4 refactor: Improve DocAI constants and Doxygen renderer
• Updated DocumentationStyle and ProgrammingLanguage enums in constants.py to use explicit string values instead of auto().
 • Introduced CodeEntityKind as a str, Enum and added GENERIC and UNION kinds.
 • Refactored DoxygenRenderer to use CodeEntityKind for mapping and added support for documenting generics, unions, and examples (
2025-12-04 16:39:28 -03:00
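The switch from auto() to explicit string values, and the str, Enum mixin, can be sketched as follows. The member values shown here are assumptions chosen for illustration, and `kind_from_config` is a hypothetical helper; the actual values in constants.py may differ.

```python
from enum import Enum


class DocumentationStyle(Enum):
    # Explicit values stay stable if members are reordered, unlike auto().
    GOOGLE = "google"
    NUMPY = "numpy"
    DOXYGEN = "doxygen"


class CodeEntityKind(str, Enum):
    # The str mixin lets members compare equal to plain strings.
    FUNCTION = "function"
    CLASS = "class"
    GENERIC = "generic"
    UNION = "union"


def kind_from_config(raw: str) -> CodeEntityKind:
    """Look up a kind by its string value (hypothetical helper)."""
    return CodeEntityKind(raw.strip().lower())
```

With explicit values, a string read from a config file or from LLM output can be mapped back to a member by value, which is one plausible motivation for the change.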
37325894f7 refactor: Update parsers to handle skip child traversal
• Modify _get_node_to_document in parsers to return a tuple: (Node | None, bool).
 • The boolean indicates if child nodes should be skipped during traversal.
 • Update BaseParser traversal logic to respect the new return value.
 • Enhance CParser and CPPParser to correctly identify nodes like template declarations where skipping children is necessary.
2025-12-04 16:36:47 -03:00
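A minimal sketch of the (Node | None, bool) contract described in this commit, using a toy `Node` class in place of a real tree-sitter node. The node type names and the free functions below are assumptions for illustration, not the BaseParser implementation.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy stand-in for a syntax-tree node (hypothetical)."""
    type: str
    children: list[Node] = field(default_factory=list)


def get_node_to_document(node: Node) -> tuple[Node | None, bool]:
    """Return (node to document or None, whether to skip its children)."""
    if node.type == "template_declaration":
        # Document the template as one unit; do not descend into it.
        return node, True
    if node.type == "function_definition":
        return node, False
    return None, False


def extract_entities(root: Node) -> list[Node]:
    """Depth-first traversal that respects the skip-children flag."""
    found: list[Node] = []
    stack = [root]
    while stack:
        node = stack.pop()
        target, skip_children = get_node_to_document(node)
        if target is not None:
            found.append(target)
        if not skip_children:
            # Reverse so children are visited in source order.
            stack.extend(reversed(node.children))
    return found


tree = Node("translation_unit", [
    Node("template_declaration", [Node("function_definition")]),
    Node("function_definition"),
])
kinds = [n.type for n in extract_entities(tree)]
```

Here the function wrapped by the template is not reported a second time, because the traversal never enters the template's children.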
f260826985 refactor: Update parsers to use _get_node_to_document
Refactor parsers to use the new _get_node_to_document method instead of _node_is_match.

This aligns with changes in BaseParser to return the specific node to document, allowing for more flexible entity extraction.
2025-12-04 15:23:07 -03:00
f543c2b6ab refactor: Improve descriptions in EntityDoc model
• Updated descriptions for entity_kind, description, members, and return_info fields in EntityDoc.
 • Descriptions are now more precise regarding renderer usage and content expectations.
2025-12-04 14:40:50 -03:00
d66f3d81c4 refactor: Adjust logging levels for initialization and success messages
• Changed several logger.info calls to logger.debug.
 • Updated one log message format in FileDocumenter.

This reduces noise in standard output by moving initialization and successful completion messages to the debug level.
2025-12-04 14:40:44 -03:00
fc8dda9ceb refactor: Centralize AST traversal logic
• Moved entity extraction logic from PythonParser to BaseParser.
 • Implemented generic AST traversal using TreeCursor in BaseParser.extract_entities.
 • Added specific matching helpers (_node_is_match, etc.) to CParser for composite types and functions.
 • Removed redundant traversal code from PythonParser.
2025-12-04 14:40:25 -03:00
81f243a04f refactor: Remove redundant comment/doc removal logic
• Removed remove_comments_and_docs and normalize_lines from BaseDocWriter.
 • Deleted corresponding methods from CCppDocWriter and PythonDocWriter.
 • Simplified BaseDocWriter.__enter__ and removed file reading logic.
 • Removed _find_entity_text_position as it relied on removed methods.

This cleans up base class methods that were duplicated or no longer necessary after changes in related logic.
2025-12-04 11:58:06 -03:00
1ccd75201e docs: Enhance LLM system prompt for documentation generation
• Added instructions to prioritize existing comments/documentation.
 • Clarified the role and required structure for the LLM output.
 • Removed unnecessary debug logging.
2025-12-04 11:38:56 -03:00
7db84fe764 refactor: Update entity extraction logic in parsers
• Replaced query-based entity extraction with AST traversal in BaseParser.
 • FileDocumenter now calls extract_entities() instead of parse().
 • PythonParser implements AST traversal logic to find definitions.
 • Removed dependency on Query and QueryCursor from BaseParser.
2025-12-04 11:32:12 -03:00
ba12d53781 fix: updating config-example.toml 2025-12-04 09:00:33 -03:00
046ed37392 refactor: Add comment removal utility to C++ writer
• Implements remove_comments_and_docs method.
 • Adds normalize_lines for stripping whitespace and expanding tabs.
 • Prepares the writer for further code processing by cleaning input.
2025-12-04 08:56:11 -03:00
ae3715784a refactor: Update entity tracking from line number to position
• Replace line_no with text_pos (row, column) in EntitySource.
 • Update parsers to use start point row/column for entity tracking.
 • Adjust writers (BaseDocWriter, CCppDocWriter, PythonDocWriter) to use the new position tuple.
 • Introduce methods in PythonDocWriter to strip comments/docs and normalize lines.
 • Update BaseDocWriter to preprocess file content upon loading.
2025-12-02 18:49:50 -03:00
eb5f2619e3 refactor: Rename SourceEntity to EntitySource and update usage
• Renamed SourceEntity to EntitySource across modules.
 • Replaced Tree Sitter ID with line_no in EntitySource.
 • Updated FileDocumenter to remove commented-out code.
 • Modified BaseParser to sort entities by line number descending after parsing.
 • Updated type hints in LLM and BaseDocWriter.
2025-12-02 17:49:29 -03:00
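Sorting entities by line number in descending order is presumably what keeps insert positions valid while documentation is written into a file: inserting from the bottom up never shifts the lines of entities that are still pending. The sketch below illustrates that idea; the `EntitySource` fields and the `insert_docs` helper are simplified assumptions, not the project's actual code.

```python
from dataclasses import dataclass


@dataclass
class EntitySource:
    """Simplified stand-in for the real model (fields are assumptions)."""
    name: str
    line_no: int  # 0-based line where the entity starts


def insert_docs(lines, entities, docs):
    """Insert one doc line above each entity, working bottom-up."""
    out = list(lines)
    # Descending order: later insertions never move earlier line numbers.
    for entity in sorted(entities, key=lambda e: e.line_no, reverse=True):
        out.insert(entity.line_no, docs[entity.name])
    return out


source = ["def a():", "    pass", "def b():", "    pass"]
entities = [EntitySource("a", 0), EntitySource("b", 2)]
result = insert_docs(source, entities, {"a": "# doc a", "b": "# doc b"})
```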
796dbf4d5d refactor: Clean up whitespace and add newlines
• Added missing blank lines in several modules (__main__.py, cli.py, config.py, constants.py, models.py, python_parser.py).
 • Improved formatting in parser.py and writer.py for better readability.
2025-12-02 16:58:35 -03:00
5938236119 refactor: Centralize constants and mappings
• Moved DocumentationStyle, ProgrammingLanguage, and CodeEntityKind to constants.py.
 • Extracted renderer mappings to mappings/documentation_renderers.py.
 • Extracted file extension handlers to mappings/extension_handlers.py.
 • Renamed and moved tree-sitter mapping to mappings/tree_sitter_mappings.py.
 • Updated imports across modules to use the new centralized mappings.
 • Replaced term "node" with "entity" in several places for consistency.
2025-12-02 16:57:25 -03:00
edfc3ccb22 refactor: Move treesitter mapping to dedicated module
• Extracted TS_NODE_KIND_MAP and related maps from __init__.py.
 • Created treesitter_mapping.py to house these definitions.
 • Updated parser.py and __init__.py to import from the new location.
2025-12-02 16:37:11 -03:00
24578f8e6d refactor: Standardize logger names and update file handling
• Renamed loggers in cli.py, config.py, and file_documenter.py for consistency.
 • Consolidated file parsing and writing logic in FileDocumenter.
 • Moved TS_NODE_KIND_MAP definition to parsers/__init__.py.
 • Removed redundant language/writer mapping in FileDocumenter.
2025-12-02 16:34:10 -03:00
5a5166a794 refactor: Extract file processing logic to FileDocumenter
• Moved documentation generation logic from cli.py to a new FileDocumenter class.
 • Updated cli.main to use FileDocumenter for processing files.
 • Cleaned up imports and removed unused variables/functions in cli.py.
 • Moved STYLE_TO_RENDERER definition to docai.doc_renderers.__init__.py.
2025-12-02 16:16:57 -03:00
b5e57da2e5 refactor: Decouple documentation writing logic
• Extracted doc writing logic from cli.py into dedicated writers.
 • Introduced BaseDocWriter and language-specific writers (PythonDocWriter, CCppDocWriter).
 • Updated CLI to use writers based on detected language.
 • Renamed base_parser.py to parser.py and updated imports.
 • Added necessary imports for renderers and parsers in cli.py.
2025-12-02 15:47:04 -03:00
d9f818483c refactor: Centralize configuration and parser definitions
• Moved CONFIG_PATHS and SUFFIX_TO_PARSER from constants.py to cli.py.
 • Removed deprecated constants.py.
 • Updated cli.py to use xdg_config_home for config path discovery.
 • Refactored config.py to define STYLE_TO_RENDERER internally and handle config loading errors more strictly.
 • Updated llm.py to use SourceEntity.kind instead of SourceEntity.type.
 • Introduced CodeEntityKind mapping in base_parser.py to translate tree-sitter node types to CodeEntityKind.
 • Updated SourceEntity to use CodeEntityKind.
 • Moved renderer selection logic in cli.py to happen before LLM call.
2025-12-02 15:04:16 -03:00
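Config path discovery along XDG lines can be sketched with the standard library alone. This is not the `xdg_config_home` helper the commit refers to: the function below, the `docai` subdirectory and the search order are assumptions for illustration. The fallback to `~/.config` when `XDG_CONFIG_HOME` is unset or empty follows the XDG Base Directory specification.

```python
import os
from pathlib import Path


def config_candidates(app_name="docai", env=None):
    """Return candidate config paths, most specific first (illustrative)."""
    env = os.environ if env is None else env
    xdg = env.get("XDG_CONFIG_HOME")
    # Unset or empty means the default of ~/.config applies.
    base = Path(xdg) if xdg else Path.home() / ".config"
    return [
        Path.cwd() / "config.toml",        # project-local config
        base / app_name / "config.toml",   # per-user config
    ]


def find_config(candidates):
    """Return the first existing candidate, or None if there is none."""
    return next((p for p in candidates if p.is_file()), None)
```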
4e4fa68ce3 refactor(docai): Decouple documentation rendering from CLI
• Replaced hardcoded GoogleDocstringRenderer usage in cli.py with dynamic renderer selection based on configuration.
 • Introduced DocumentationStyle enum and STYLE_TO_RENDERER mapping in constants.py.
 • Updated Config to load documentation styles mapping to DocumentationStyle enums.
 • Removed example/requirement fetching from LLM prompt in llm.py as style is now configured.
 • Added NumpyDocstringRenderer implementation.

This centralizes documentation style configuration and allows supporting multiple output formats dynamically.
2025-12-02 14:46:37 -03:00
a212614ca6 refactor: Update documentation generation structure
• Rename CodeNode to SourceEntity and CodeNodeLang to
ProgrammingLanguage.
 • Introduce EntityDoc and MemberDoc for structured documentation
output.
 • Update CLI to use EntityDoc and render using GoogleDocstringRenderer
by default.
 • Implement base Renderer class and specific renderers for Google and
Doxygen styles.
 • Update Config and DocWriter to use the new model and language enums.
2025-12-02 14:09:01 -03:00
5e13fd2258 refactor(models): Consolidate code node and doc models
• Moved CodeNode and CodeNodeLang from code_node.py to models.py.
 • Introduced DocModel in models.py for LLM output structure.
 • Updated imports across the project to reference docai.models.
 • Removed debugging print statements in cli.py.
 • Updated CodeNode initialization in base_parser.py to use keyword arguments.
2025-12-02 12:36:14 -03:00
0c7dd704cb Merge pull request 'refactor: replace CodeParser with modular parser system' (#1) from feat/better_parser into main
Reviewed-on: #1
2025-12-02 11:36:35 -03:00
a21cccf048 refactor: replace CodeParser with modular parser system
- Remove CodeParser class and integrate BaseParser-based parsers via
SUFFIX_TO_PARSER
 - Simplify CodeNode.type from enum to str
 - Update CLI to use new generate_doc function with file-specific
parsers
 - Add docstring to CCppDocStrategy.get_insert_position with usage
warning
2025-12-02 11:35:25 -03:00
cb3ea98890 refactor: migrate to pyproject.toml and update CLI entry point
• Move dependencies from requirements.txt to pyproject.toml with static
listing
 • Add project.scripts entry for docai CLI app
 • Update __main__.py to use docai.cli:app instead of main.py
 • Remove obsolete main.py and requirements.txt files

This modernizes packaging and simplifies the CLI setup.
2025-12-01 16:18:15 -03:00
1ed786e7ad feat: add comprehensive logging system and enhance code documentation
• Introduce configurable logging with --log-level option in main.py
 • Refactor logger initialization into a dedicated function
 • Add detailed logging statements across CodeParser, LLM, and main
modules
 • Enhance docstrings for CodeNodeLang, CodeNodeType, and CodeNode
classes
 • Improve error handling and user feedback in parsing and LLM
generation
 • Minor formatting fixes in doc_writer.py and remove interactive pause
in main.py
2025-12-01 15:46:55 -03:00
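A --log-level option paired with a dedicated logger initialization function might look like the sketch below. It uses argparse from the standard library, which may not be the CLI framework the project uses; `build_parser` and `init_logger` are hypothetical names.

```python
import argparse
import logging

LEVELS = ["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"]


def build_parser():
    """Build a parser with a case-insensitive --log-level option."""
    parser = argparse.ArgumentParser(prog="docai")
    parser.add_argument(
        "--log-level",
        default="INFO",
        type=str.upper,   # normalize before the choices check
        choices=LEVELS,
    )
    return parser


def init_logger(level_name, name="docai"):
    """Configure and return the named logger at the requested level."""
    logger = logging.getLogger(name)
    logger.setLevel(getattr(logging, level_name))
    return logger


args = build_parser().parse_args(["--log-level", "debug"])
logger = init_logger(args.log_level)
```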
641b9d71b4 fix: ignoring config.toml and adding config-example.toml 2025-12-01 15:15:35 -03:00
12a3ab90d4 fix: Refactor docai to remove text_point dependency and enable doc writing
- Add LLM config section in config.toml with base_url, api_key, model
 - Rename config sections to [language.*] for consistency
 - Remove TextPoint class and text_point from CodeNode and CodeParser
 - Implement _find_node_text_position in DocWriter for position
detection via text matching
 - Update LLM.generate_doc to return DocModel | None and handle errors
 - Modify main.py to write generated docs to files with user
confirmation
2025-12-01 15:13:50 -03:00
356163d2b0 feat: working for C, Python and CPP 2025-12-01 14:36:03 -03:00
7bdbf0f11f feat: basic document writing 2025-12-01 11:36:16 -03:00
4f097dec4c fix: making python 3.13 to silence warnings 2025-12-01 11:07:29 -03:00
6640057428 feat: adding proper logger 2025-12-01 10:57:39 -03:00
f35bce1905 feat: adding config search with XDG folders 2025-12-01 10:19:05 -03:00
3b6bd1e06e feat: allowing output_requirements in config.toml 2025-11-29 09:12:49 -03:00
5d29362f30 chore: formatting 2025-11-28 18:56:40 -03:00
d62153d751 feat: basic working documenter with write on file 2025-11-28 18:54:47 -03:00
f2f8934b38 feat: properly instructing LLM via StructuredOutput to output only code 2025-11-28 16:22:38 -03:00
04235837d6 feat: better CodeParser init logic 2025-11-28 15:26:32 -03:00
b46fd7fd8f feat: improving doc configuration 2025-11-28 11:55:50 -03:00
03a7def5a1 feat: generating basic docs 2025-11-27 17:04:19 -03:00
62508671bc fix: adjusting Type Hints 2025-11-27 15:35:26 -03:00
fb23d4befc feat: basic python parser 2025-11-27 08:53:18 -03:00
4640471239 chore: renaming to docai 2025-11-27 07:56:07 -03:00
04f9f1594b Initial commit 2025-11-27 07:54:22 -03:00