|
Pigweed
|
#include <token_database.h>
Classes | |
| class | Entries |
| struct | Entry |
| An entry in the token database. | |
| class | iterator |
| Iterator for TokenDatabase values. | |
Public Types | |
| using | value_type = Entry |
| using | size_type = std::size_t |
| using | difference_type = std::ptrdiff_t |
| using | reference = value_type & |
| using | const_reference = const value_type & |
| using | pointer = const value_type * |
| using | const_pointer = const value_type * |
| using | const_iterator = iterator |
| using | reverse_iterator = std::reverse_iterator< iterator > |
| using | const_reverse_iterator = std::reverse_iterator< const_iterator > |
Public Member Functions | |
| constexpr | TokenDatabase () |
Creates a database with no data. ok() returns false. | |
| Entries | Find (uint32_t token) const |
Returns all entries associated with this token. This is O(n). | |
| constexpr size_type | size () const |
| Returns the total number of entries (unique token-string pairs). | |
| constexpr bool | ok () const |
| constexpr iterator | begin () const |
| Returns an iterator for the first token entry. | |
| constexpr iterator | end () const |
| Returns an iterator for one past the last token entry. | |
Static Public Member Functions | |
| template<typename ByteArray > | |
| static constexpr bool | IsValid (const ByteArray &bytes) |
| template<const auto & kDatabaseBytes> | |
| static constexpr TokenDatabase | Create () |
| template<typename ByteArray > | |
| static constexpr TokenDatabase | Create (const ByteArray &database_bytes) |
Static Public Attributes | |
| static constexpr uint32_t | kDateRemovedNever = 0xFFFFFFFF |
Reads entries from a v0 binary token string database. This class does not copy or modify the contents of the database.
The v0 token database has two significant shortcomings:
- Strings cannot contain null terminators (\0). If a string contains a \0, the database will not work correctly.
- The domain is not included in entries. All tokens belong to a single domain, which must be known independently.

A v0 binary token database consists of a 16-byte header followed by an array of 8-byte entries and a table of null-terminated strings. The header specifies the number of entries. Each entry contains information about a tokenized string: the token and removal date, if any. All fields are little-endian.
The token removal date is stored within an unsigned 32-bit integer. It is stored as <day> <month> <year>, where <day> and <month> are 1 byte each and <year> is two bytes. The fields are set to their maximum value (0xFF or 0xFFFF) if they are unset. With this format, dates may be compared naturally as unsigned integers.
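The ordering property can be checked with a small sketch. It assumes the four date bytes are read as one little-endian integer, so the year lands in the high 16 bits, the month in the next byte, and the day in the low byte. `PackRemovalDate` and `kNeverRemoved` are hypothetical names used for illustration only; they are not part of pw_tokenizer.

```cpp
#include <cstdint>

// Hypothetical helper (not a pw_tokenizer API): packs a removal date so that
// its little-endian byte order is <day> <month> <year low> <year high>.
constexpr std::uint32_t PackRemovalDate(std::uint16_t year,
                                        std::uint8_t month,
                                        std::uint8_t day) {
  return (static_cast<std::uint32_t>(year) << 16) |
         (static_cast<std::uint32_t>(month) << 8) |
         static_cast<std::uint32_t>(day);
}

// All fields unset; same value as kDateRemovedNever.
constexpr std::uint32_t kNeverRemoved = 0xFFFFFFFF;

// Later dates compare greater, and "never removed" compares greater than any
// real date.
static_assert(PackRemovalDate(2020, 12, 31) < PackRemovalDate(2021, 1, 1),
              "year dominates month and day");
static_assert(PackRemovalDate(2021, 1, 1) < PackRemovalDate(2021, 1, 2),
              "day breaks ties within a month");
static_assert(PackRemovalDate(2021, 1, 2) < kNeverRemoved,
              "unset date sorts last");
static_assert(PackRemovalDate(0xFFFF, 0xFF, 0xFF) == kNeverRemoved,
              "unset fields are all ones");
```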
====== ==== =========================
Header (16 bytes)
-------------------------------------
Offset Size Field
====== ==== =========================
0      6    Magic number (``TOKENS``)
6      2    Version (``00 00``)
8      4    Entry count
12     4    Reserved
====== ==== =========================

====== ==== ==================================
Entry (8 bytes)
----------------------------------------------
Offset Size Field
====== ==== ==================================
0      4    Token
4      1    Removal day (1-31, 255 if unset)
5      1    Removal month (1-12, 255 if unset)
6      2    Removal year (65535 if unset)
====== ==== ==================================
Entries are sorted by token. A string table with a null-terminated string for each entry in order follows the entries.
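To make the layout concrete, the following sketch walks the header, the entry array, and the string table by hand. It is an independent illustration of the format described above, not the TokenDatabase implementation; `ParsedEntry`, `ReadLe`, and `ParseV0` are hypothetical names. Unlike TokenDatabase, it copies each string.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical result type for this sketch.
struct ParsedEntry {
  std::uint32_t token;
  std::uint32_t date_removed;  // day, month, year read as one LE integer
  std::string text;
};

// Reads a little-endian unsigned integer of `size` bytes (size <= 4).
inline std::uint32_t ReadLe(const unsigned char* p, std::size_t size) {
  std::uint32_t value = 0;
  for (std::size_t i = 0; i < size; ++i) {
    value |= static_cast<std::uint32_t>(p[i]) << (8 * i);
  }
  return value;
}

// Parses a v0 database. Returns an empty vector if the data is malformed.
inline std::vector<ParsedEntry> ParseV0(const unsigned char* data,
                                        std::size_t size) {
  constexpr std::size_t kHeaderSize = 16;
  constexpr std::size_t kEntrySize = 8;
  std::vector<ParsedEntry> entries;

  // Bytes 0-5 are the magic number, bytes 6-7 the version (00 00).
  if (size < kHeaderSize || std::memcmp(data, "TOKENS\0\0", 8) != 0) {
    return entries;
  }
  const std::uint32_t count = ReadLe(data + 8, 4);
  if (count > (size - kHeaderSize) / kEntrySize) {
    return entries;
  }

  // The string table starts right after the last entry.
  std::size_t str = kHeaderSize + kEntrySize * count;
  for (std::uint32_t i = 0; i < count; ++i) {
    const unsigned char* entry = data + kHeaderSize + kEntrySize * i;
    const void* nul = std::memchr(data + str, 0, size - str);
    if (nul == nullptr) {
      return {};  // fewer strings than entries
    }
    const std::size_t end = static_cast<const unsigned char*>(nul) - data;
    entries.push_back(
        {ReadLe(entry, 4), ReadLe(entry + 4, 4),
         std::string(reinterpret_cast<const char*>(data + str), end - str)});
    str = end + 1;  // the i-th string belongs to the i-th entry
  }
  return entries;
}
```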
Entries are accessed by iterating over the database. An O(n) Find function is also provided. In typical use, a TokenDatabase is preprocessed by a pw::tokenizer::Detokenizer into a std::unordered_map.
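The kind of preprocessing described here can be sketched as a single pass that groups entries by token, so later lookups avoid the linear scan. This is an illustration only, not the Detokenizer's actual code. `SketchEntry`, `TokenMap`, and `BuildTokenMap` are hypothetical, and the field names of `SketchEntry` are an assumption about what an entry carries.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a database entry; field names are assumed.
struct SketchEntry {
  std::uint32_t token;
  std::uint32_t date_removed;
  const char* string;
};

// Several strings may share one token, so each token maps to a list.
using TokenMap = std::unordered_map<std::uint32_t, std::vector<SketchEntry>>;

// Builds the map with one pass over any range of entries.
template <typename Range>
TokenMap BuildTokenMap(const Range& entries) {
  TokenMap map;
  for (const SketchEntry& entry : entries) {
    map[entry.token].push_back(entry);
  }
  return map;
}
```

After the one-time O(n) build, each lookup by token is expected constant time, which is why iterating once is preferable to calling Find repeatedly.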
| template<const auto & kDatabaseBytes> static constexpr TokenDatabase Create () | inline static constexpr |
Creates a TokenDatabase and checks if the provided data is valid at compile time. Accepts references to constexpr containers (array, span, string_view, etc.) with static storage duration. For example:
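A self-contained sketch of the underlying pattern: passing a reference to a constexpr object with static storage duration as a template argument lets the factory validate the bytes in a static_assert. The code requires C++17 and does not use pw_tokenizer; `SketchDatabase`, `HasV0Header`, and `kEmptyDb` are hypothetical, and the check covers only the magic number and version rather than the full validation.

```cpp
#include <array>
#include <cstddef>

// Hypothetical, simplified check: magic number, version, and header size.
template <typename ByteArray>
constexpr bool HasV0Header(const ByteArray& bytes) {
  constexpr char kMagicAndVersion[8] = {'T', 'O', 'K', 'E', 'N', 'S', 0, 0};
  if (bytes.size() < 16) {
    return false;
  }
  for (std::size_t i = 0; i < 8; ++i) {
    if (static_cast<char>(bytes[i]) != kMagicAndVersion[i]) {
      return false;
    }
  }
  return true;
}

// Hypothetical non-owning view, standing in for the real class.
struct SketchDatabase {
  const unsigned char* data;
  std::size_t size;

  // kBytes is a template argument, so it is usable in static_assert.
  template <const auto& kBytes>
  static constexpr SketchDatabase Create() {
    static_assert(HasV0Header(kBytes), "not a v0 token database");
    return {kBytes.data(), kBytes.size()};
  }
};

// Namespace scope gives the array static storage duration.
inline constexpr std::array<unsigned char, 16> kEmptyDb = {
    'T', 'O', 'K', 'E', 'N', 'S', 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};

// Invalid bytes would make this line fail to compile.
constexpr SketchDatabase kDb = SketchDatabase::Create<kEmptyDb>();
static_assert(kDb.size == 16, "view covers the whole array");
```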
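| template<typename ByteArray > static constexpr TokenDatabase Create (const ByteArray &database_bytes) | inline static constexpr |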
Creates a TokenDatabase from the provided byte array. The array may be a span, array, or other container type. If the data is not valid, returns a default-constructed database for which ok() is false.
Prefer the Create overload that takes the data as a template parameter when possible, since that overload verifies data integrity at compile time.
| template<typename ByteArray > static constexpr bool IsValid (const ByteArray &bytes) | inline static constexpr |
Returns true if the provided data is a valid token database. This checks the magic number (TOKENS), version (which must be 0), and that there is one string for each entry in the database. A database with extra strings or other trailing data is considered valid.
| constexpr bool ok () const | inline constexpr |
True if this database was constructed with valid data. The database might be empty, but it has an intact header and a string for each entry.
| constexpr uint32_t kDateRemovedNever = 0xFFFFFFFF | static constexpr |
Default date_removed for an entry in the token database if it was never removed.