llama.cpp/CODEOWNERS
Aldehir Rojas 0a8026e768 common : introduce composable PEG parser combinators for chat parsing (#17136)
* common : implement parser combinators to simplify chat parsing

* add virtual destructor to parser_base

* fix memory leak from circular references of rules

* implement gbnf grammar building

* remove unused private variable

* create a base visitor and implement id assignment as a visitor

* fix const ref for grammar builder

* clean up types, friend classes, and class declarations

* remove builder usage from until_parser

* Use a counter class to help assign rule ids

* cache everything

* add short description for each parser

* create a type for the root parser

* implement repetition parser

* Make optional, one_or_more, and zero_or_more subclasses of repetition

* improve context constructor

* improve until parsing and add benchmarks

* remove cached() pattern, cache in parser_base with specialized parsing functions for each parser

* improve json parsing performance to better match legacy parsing

* fix const auto * it for windows

* move id assignment to classes instead of using a visitor

* create named rules in the command r7b example

* use '.' for any in GBNF

* fix parens around choices in gbnf grammar

* add convenience operators to turn strings to literals

* add free-form operators for const char * to simplify defining literals

* simplify test case parser

* implement semantic actions

* remove groups in favor of actions and a scratchpad

* add built in actions for common operations

* add actions to command r7b example

* use std::default_searcher for platforms that don't have bm

* improve parser_type handling and add cast helper

* add partial result type to better control when to run actions

* fix bug in until()

* run actions on partial results by default

* use common_chat_msg for result

* add qwen3 example wip

* trash partial idea and simplify

* move action arguments to a struct

* implement aho-corasick matcher for until_parser and to build exclusion grammars

* use std::string for input, since std::string_view is incompatible with std::regex

* Refactor tests

* improve qwen3 example

* implement sax-style parsing and refactor

* fix json string in test

* rename classes to use common_chat_ prefix

* remove is_ suffix from functions

* rename from id_counter to just counter

* Final refactored tests

* Fix executable name and editorconfig-checker

* Third time's the charm...

* add trigger parser to begin lazy grammar rule generation

* working lazy grammar

* refactor json rules now that we check for reachability

* reduce pointer usage

* print out grammars in example

* rename to chat-peg-parser* and common_chat_peg_parser*

* Revert unrelated changes

* New macros for CMakeLists to enable multi-file compilations

* starting unicode support

* add unicode support to char_parser

* use unparsed args as additional sources

* Refactor tests to new harness

* Fix CMakeLists

* fix rate calculation

* add unicode tests

* fix trailing whitespace and line endings

skip-checks: true

* Helpers + rewrite qwen3 with helpers

* Fix whitespace

* extract unicode functions to separate file

* refactor parse unicode function

* fix compiler error

* improve construction of sequence/choice parsers

* be less clever

* add make_parser helper function

* expand usage of make_parser, alias common_chat_msg_peg_parser_builder to builder in source

* lower bench iterations

* add unicode support to until_parser

* add unicode support to json_string_parser

* clean up unicode tests

* reduce unicode details to match src/unicode.cpp

* simplify even further

* remove unused functions

* fix type

* reformat char class parsing

* clean up json string parser

* clean up + fix diagnostics

* reorder includes

* compact builder functions

* replace action_parser with capture_parser, rename env to semantics

* rename env to semantics

* clean up common_chat_parse_context

* move type() to below constant

* use default constructor for common_chat_peg_parser

* make all operators functions for consistency

* fix compilation errors in test-optional.cpp

* simplify result values

* rename json_string_unquoted to json_string_content

* Move helper to separate class, add separate explicit and helper classes

* Whitespace

* Change + to append()

* Reformat

* Add extra helpers, tests and Minimax example

* Add some extra optional debugging prints + real example of how to use them

* fix bug in repetitions when min_count = 0 reports failures

* dump rule in debug

* fix token accumulation and assert parsing never fails

* indent debug by depth

* use LOG_* in tests so logs sync up with test logs

* - Add selective testing
- Refactor all messaging to use LOG_ERR
- Fix lack of argument / tool name capturing
- Temporary fix for double event capture

* refactor rule() and introduce ref()

* clean up visitor

* clean up indirection in root parser w.r.t rules

* store shared ptr directly in parser classes

* replace aho-corasick automation with a simple trie

* Reset prev for qwen3 helper example variant

* refactor to use value semantics with std::variant/std::visit

* simplify trie_matcher result

* fix linting issues

* add annotations to rules

* revert test workaround

* implement serializing the parser

* remove redundant parsers

* remove tests

* gbnf generation fixes

* remove LOG_* use in tests

* update gbnf tests to test entire grammar

* clean up gbnf generation and fix a few bugs

* fix typo in test output

* remove implicit conversion rules

* improve test output

* rename trie_matcher to trie

* simplify trie to just know if a node is the end of a word

* remove common_chat_ prefix and ensure a common_peg_ prefix to all types

* rename chat-peg-parser -> peg-parser

* promote chat-peg-parser-helper to chat-peg-parser

* checkpoint

* use a static_assert to ensure we handle every branch

* inline trivial peg parser builders

* use json strings for now

* implement basic and native chat peg parser builders/extractors

* resolve refs to their rules

* remove packrat caching (for now)

* update tests

* compare parsers with incremental input

* benchmark both complete and incremental parsing

* add raw string generation from json schema

* add support for string schemas in gbnf generation

* fix qwen example to include \n

* tidy up example

* rename extractor to mapper

* rename ast_arena to ast

* place basic tests into one

* use gbnf_format_literal from json-schema-to-grammar

* integrate parser with common/chat and server

* clean up schema and serialization

* add json-schema raw string tests

* clean up json creation and remove capture parser

* trim spaces from reasoning and content

* clean up redundant rules and comments

* rename input_is_complete to is_partial to match rest of project

* simplify json rules

* remove extraneous file

* remove comment

* implement += and |= operators

* add comments to qwen3 implementation

* reorder arguments to common_chat_peg_parse

* remove commented outdated tests

* add explicit copy constructor

* fix operators and constness

* wip: update test-chat for qwen3-coder

* bring json parser closer to json-schema-to-grammar rules

* trim trailing space for most things

* fix qwen3 coder rules w.r.t. trailing spaces

* group rules

* do not trim trailing space from string args

* tweak spacing of qwen3 grammar

* update qwen3-coder tests

* qwen3-coder small fixes

* place parser in common_chat_syntax to simplify invocation

* use std::set to collect rules to keep order predictable for tests

* initialize parser to make certain platforms happy

* revert back to std::unordered_set, sort rule names at the end instead

* uncomment rest of chat tests

* define explicit default constructor

* improve arena init and server integration

* fix chat test

* add json_member()

* add a comprehensive native example

* clean up example qwen test and add response_format example to native test

* make build_peg_parser accept std::function instead of template

* change peg parser parameters into const ref

* push tool call on tool open for constructed parser

* add parsing documentation

* clean up some comments

* add json schema support to qwen3-coder

* add id initializer in tests

* remove grammar debug line from qwen3-coder

* refactor qwen3-coder to use sequence over operators

* only call common_chat_peg_parse if appropriate format

* simplify qwen3-coder space handling

* revert qwen3-coder implementation

* revert json-schema-to-grammar changes

* remove unnecessary forward declaration

* small adjustment to until_parser

* rename C/C++ files to use dashes

* codeowners : add aldehir to peg-parser and related files

---------

Co-authored-by: Piotr Wilkin <piotr.wilkin@syndatis.com>
2025-12-03 12:45:32 +02:00


# collaborators can optionally add themselves here to indicate their availability for reviewing related PRs
# multiple collaborators per item can be specified
/.devops/*.Dockerfile @ngxson
/.github/actions/ @CISC
/.github/workflows/ @CISC
/ci/ @ggerganov
/cmake/ @ggerganov
/common/CMakeLists.txt @ggerganov
/common/arg.* @ggerganov
/common/base64.hpp.* @ggerganov
/common/build-info.* @ggerganov
/common/chat-peg-parser.* @aldehir
/common/common.* @ggerganov
/common/console.* @ggerganov
/common/http.* @angt
/common/llguidance.* @ggerganov
/common/log.* @ggerganov
/common/peg-parser.* @aldehir
/common/sampling.* @ggerganov
/common/speculative.* @ggerganov
/common/unicode.* @aldehir
/convert_*.py @CISC
/examples/batched.swift/ @ggerganov
/examples/batched/ @ggerganov
/examples/convert-llama2c-to-ggml/ @ggerganov
/examples/deprecation-warning/ @ggerganov
/examples/diffusion/ @am17an
/examples/embedding/ @ggerganov
/examples/eval-callback/ @ggerganov
/examples/export-docs/ @ggerganov
/examples/gen-docs/ @ggerganov
/examples/gguf/ @ggerganov
/examples/llama.android/ @ggerganov
/examples/llama.swiftui/ @ggerganov
/examples/llama.vim @ggerganov
/examples/lookahead/ @ggerganov
/examples/lookup/ @JohannesGaessler
/examples/model-conversion/ @danbev
/examples/parallel/ @ggerganov
/examples/passkey/ @ggerganov
/examples/retrieval/ @ggerganov
/examples/save-load-state/ @ggerganov
/examples/speculative-simple/ @ggerganov
/examples/speculative/ @ggerganov
/ggml/cmake/ @ggerganov
/ggml/include/ @ggerganov
/ggml/src/ggml-common.h @ggerganov
/ggml/src/ggml-cpu/ @ggerganov
/ggml/src/ggml-cpu/spacemit/ @alex-spacemit
/ggml/src/ggml-cuda/fattn* @JohannesGaessler
/ggml/src/ggml-cuda/mmf.* @JohannesGaessler @am17an
/ggml/src/ggml-cuda/mmq.* @JohannesGaessler
/ggml/src/ggml-cuda/mmvf.* @JohannesGaessler
/ggml/src/ggml-cuda/mmvq.* @JohannesGaessler
/ggml/src/ggml-cuda/fattn-wmma* @IMbackK
/ggml/src/ggml-hip/ @IMbackK
/ggml/src/ggml-cuda/vendors/hip.h @IMbackK
/ggml/src/ggml-impl.h @ggerganov
/ggml/src/ggml-metal/ @ggerganov
/ggml/src/ggml-opencl/ @lhez @max-krasnyansky
/ggml/src/ggml-hexagon/ @max-krasnyansky @lhez
/ggml/src/ggml-opt.cpp @JohannesGaessler
/ggml/src/ggml-quants.* @ggerganov
/ggml/src/ggml-rpc/ @rgerganov
/ggml/src/ggml-threading.* @ggerganov
/ggml/src/ggml-vulkan/ @0cc4m
/ggml/src/ggml-webgpu/ @reeselevine
/ggml/src/ggml-zdnn/ @taronaeo @Andreas-Krebbel @AlekseiNikiforovIBM
/ggml/src/ggml.c @ggerganov
/ggml/src/ggml.cpp @ggerganov
/ggml/src/gguf.cpp @JohannesGaessler @Green-Sky
/gguf-py/ @CISC
/media/ @ggerganov
/scripts/gen* @ggerganov
/scripts/get* @ggerganov
/scripts/sync* @ggerganov
/src/ @ggerganov
/src/llama-adapter.* @CISC
/src/llama-arch.* @CISC
/src/llama-chat.* @ngxson
/src/llama-graph.* @CISC
/src/llama-model.* @CISC
/src/llama-vocab.* @CISC
/src/models/ @CISC
/tests/ @ggerganov
/tools/batched-bench/ @ggerganov
/tools/main/ @ggerganov
/tools/mtmd/ @ngxson
/tools/perplexity/ @ggerganov
/tools/quantize/ @ggerganov
/tools/rpc/ @rgerganov
/tools/server/* @ngxson @ggerganov # no subdir
/tools/server/webui/ @allozaur
/tools/tokenize/ @ggerganov
/tools/tts/ @ggerganov
/vendor/ @ggerganov
/AUTHORS @ggerganov
/CMakeLists.txt @ggerganov
/CONTRIBUTING.md @ggerganov
/LICENSE @ggerganov
/README.md @ggerganov
/SECURITY.md @ggerganov
/build-xcframework.sh @danbev
requirements*.txt @CISC