* feat: improve search tokenization for CJK languages
Enhance the encoder function to properly tokenize CJK (Chinese, Japanese,
Korean) characters while maintaining English word tokenization. This fixes
search issues where CJK text was not searchable due to whitespace-only
splitting.
Changes:
- Tokenize CJK characters (Hiragana, Katakana, Kanji, Hangul) individually
- Preserve whitespace-based tokenization for non-CJK text
- Support mixed CJK/English content in search queries
This addresses the CJK search issues reported in #2109 where Japanese text
like "て以来" was not searchable because the encoder only split on whitespace.
Tested with Japanese, Chinese, and Korean content to verify character-level
tokenization works correctly while maintaining English search functionality.
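A minimal sketch of the tokenization rule described above, assuming a regex-based first pass. The name `tokenize` and the exact Unicode ranges are illustrative assumptions, not the project's actual encoder:

```typescript
// Hypothetical helper for illustration only.
// First alternative: one CJK character (Hiragana, Katakana, CJK Unified
// Ideographs, Hangul Syllables). Second alternative: a run of characters
// that are neither whitespace nor CJK, i.e. an ordinary word.
const TOKEN_PATTERN =
  /[\u3040-\u309f\u30a0-\u30ff\u4e00-\u9fff\uac00-\ud7af]|[^\s\u3040-\u309f\u30a0-\u30ff\u4e00-\u9fff\uac00-\ud7af]+/g

function tokenize(text: string): string[] {
  // match() returns null when nothing matches, so fall back to an empty list.
  return text.match(TOKEN_PATTERN) ?? []
}
```

Under these assumptions, `tokenize("Hello て以来 world")` yields `["Hello", "て", "以", "来", "world"]`, so a query for "て以来" can match character by character.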
* perf: optimize CJK search encoder with manual buffer tracking
Replace regex-based tokenization with index-based buffer management.
This improves performance by ~2.93x according to benchmark results.
- Use explicit buffer start/end indices instead of string concatenation
- Replace split(/\s+/) with direct whitespace code point checks
- Remove redundant filter() operations
- Add CJK Extension B support (U+20000-U+2A6DF)
Performance: ~878ms → ~300ms (100 iterations, mixed CJK/English text)
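The index-based approach could look roughly like the sketch below. The function names (`isCJK`, `isWhitespace`, `encode`), the set of whitespace code points, and the ranges checked are assumptions for illustration; the real encoder may differ:

```typescript
// Hypothetical range check; the actual encoder may cover other blocks.
function isCJK(cp: number): boolean {
  return (
    (cp >= 0x3040 && cp <= 0x309f) || // Hiragana
    (cp >= 0x30a0 && cp <= 0x30ff) || // Katakana
    (cp >= 0x4e00 && cp <= 0x9fff) || // CJK Unified Ideographs
    (cp >= 0xac00 && cp <= 0xd7af) || // Hangul Syllables
    (cp >= 0x20000 && cp <= 0x2a6df) // U+20000-U+2A6DF (supplementary plane)
  )
}

// Assumed whitespace set: space, tab, line feed, carriage return.
function isWhitespace(cp: number): boolean {
  return cp === 0x20 || cp === 0x09 || cp === 0x0a || cp === 0x0d
}

function encode(text: string): string[] {
  const tokens: string[] = []
  let bufferStart = -1 // start index of the pending non-CJK word, -1 if none
  let i = 0
  while (i < text.length) {
    const cp = text.codePointAt(i)!
    const width = cp > 0xffff ? 2 : 1 // supplementary characters use two UTF-16 units
    if (isCJK(cp) || isWhitespace(cp)) {
      // Both cases end the pending word; slice once instead of concatenating.
      if (bufferStart !== -1) {
        tokens.push(text.slice(bufferStart, i))
        bufferStart = -1
      }
      // A CJK character is a token on its own; whitespace is dropped.
      if (isCJK(cp)) tokens.push(text.slice(i, i + width))
    } else if (bufferStart === -1) {
      bufferStart = i // first character of a new word
    }
    i += width
  }
  if (bufferStart !== -1) tokens.push(text.slice(bufferStart))
  return tokens
}
```

Because a word is only pushed when it is non-empty, no trailing `filter()` pass is needed in this sketch.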
* test: add comprehensive unit tests for CJK search encoder
Add 21 unit tests covering:
- English word tokenization
- CJK character-level tokenization (Japanese, Korean, Chinese)
- Mixed CJK/English content
- Edge cases
All tests pass, confirming the encoder correctly handles CJK text.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
* Add rtl automatic detection to base.scss
* Implement RTL support for Arabic and Persian locales and update HTML direction attribute in renderPage component
* Update HTML direction attribute in renderPage component to prioritize frontmatter dir value
* Refactor renderPage component to simplify HTML direction attribute assignment by removing frontmatter dir fallback
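As a rough illustration of locale-driven direction, here is a sketch that derives the `dir` value from the locale alone. `directionForLocale` and `RTL_LANGUAGES` are hypothetical names, and the language list only reflects the Arabic and Persian locales mentioned above:

```typescript
// Hypothetical helper; not the project's actual renderPage logic.
// Only the languages named in the commits above are listed.
const RTL_LANGUAGES = ["ar", "fa"]

function directionForLocale(locale: string): "rtl" | "ltr" {
  // "ar-SA" -> "ar"; the region subtag does not affect direction here.
  const language = locale.split("-")[0].toLowerCase()
  return RTL_LANGUAGES.indexOf(language) !== -1 ? "rtl" : "ltr"
}
```

The result would then be written to the `dir` attribute of the `<html>` element, for example `directionForLocale("fa-IR")` gives `"rtl"` and `directionForLocale("en-US")` gives `"ltr"`.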
* chore(deps): update flexsearch to version 0.8.205 and adjust search encoder.
* refactor(search): enhance search encoder and update search results type
- Improved the encoder function to filter out empty tokens.
- Updated the search results type from a specific FlexSearch type to a more generic 'any' type for flexibility.
- Removed redundant rtl property from the index configuration.
* refactor(search): remove rtl property from search index configuration
* refactor(search): improve encoder function formatting
- Updated the encoder function to use consistent arrow function syntax for better readability.
* refactor(search): update search results type to DefaultDocumentSearchResults
- Imported DefaultDocumentSearchResults from FlexSearch for improved type safety.
- Changed the type of searchResults from 'any' to DefaultDocumentSearchResults<Item> for better clarity and maintainability.
* fix(flex): respect DesktopOnly and MobileOnly components
* Use classNames util function
* fix(ofm): allow wikilink alias to be empty (#1984)
This is in line with Obsidian's behavior.
* fix(style): KaTeX adding scrollbars on non-overflowing content (#1989)
* feat(i18n): Bahasa Indonesia translations (#1981)
* fix(a11y): increased content-meta text contrast (#1980)
* fix(analytics): streamline posthog script loading and event capturing (#1974)
* css: adjust color blend for search bg
* feat(links): added ofm option to style unresolved or broken links differently (#1992)
* feat: add option to disable broken wikilinks
* fix(style): update hover color for broken links and introduce new class
* feat: add "disableBrokenWikilinks" option to ObsidianFlavoredMarkdown
* chore(deps): replace `chalk` and `rimraf` with builtin functions (#1879)
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* chore(deps): bump the production-dependencies group across 1 directory with 9 updates (#1996)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Node 22 (#1997)
* docs: showcase housekeeping
* docs: fix explorernode references (closes #1985)
* fix: tz-less date parse in local tz instead of utc (closes #1615)
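For context, JavaScript parses a date-only ISO string such as "2024-01-15" as UTC midnight, which can display as the previous day in time zones behind UTC. One possible way to parse such strings in the local time zone is sketched below; `parseDateLocal` is a hypothetical name and this is not necessarily how the fix was implemented:

```typescript
// Hypothetical helper for illustration only.
const DATE_ONLY = /^(\d{4})-(\d{2})-(\d{2})$/

function parseDateLocal(input: string): Date {
  const m = DATE_ONLY.exec(input.trim())
  if (m) {
    // The multi-argument Date constructor interprets its fields in the
    // local time zone; months are zero-based.
    return new Date(Number(m[1]), Number(m[2]) - 1, Number(m[3]))
  }
  // Anything else (for example a string with an explicit offset) is left
  // to the built-in parser.
  return new Date(input)
}
```

With this sketch, `parseDateLocal("2024-01-15").getDate()` is 15 regardless of the machine's time zone.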
* docs: added note to not forget to add https:// to the plausible-host (for #1337) (#2000)
* docs: added note to not forget to add https:// to the plausible-host (for #1337)
* Update docs/configuration.md
---------
Co-authored-by: Jacky Zhao <j.zhao2k19@gmail.com>
* Updated documentation
---------
Co-authored-by: Nizav <106657905+Ni-zav@users.noreply.github.com>
Co-authored-by: Aswanth <aswanth366@gmail.com>
Co-authored-by: Jacky Zhao <j.zhao2k19@gmail.com>
Co-authored-by: Keisuke ANDO <g.kei0429@gmail.com>
Co-authored-by: fl0werpowers <47599466+fl0werpowers@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Sebastian Moser <64004956+c2vi@users.noreply.github.com>
* fix(popover): automatically position heading links at heading
* Implement linking of block references
* Popover fixes
* id mapping
* Remove excess regexes
* Updated blockref
* Remove linker element
* Restore the docs to their former glory
* Move the hash out of the loop
* Redundant
* Redundant
* Restore docs
* Remove log
* Let it const
* fix(RecentNotes): prevent folder pages from always appearing first
Pass prioritizeFolders=false to byDateAndAlphabetical in RecentNotes to sort strictly by date/alphabetical order, fixing issue #1901.
* refactor: split sorting functions for clarity
- Split byDateAndAlphabetical into two separate functions
- byDateAndAlphabetical: sorts strictly by date and alphabetically
- byDateAndAlphabeticalFolderFirst: sorts with folders first
- Updated RecentNotes to use date-only sorting
* fix(PageList): keep byDateAndAlphabeticalFolderFirst as the default sorting order for PageList
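The split between the two comparators might be sketched as follows. The `Page` shape, the newest-first ordering, and the placement of undated pages are simplifying assumptions; only the two function names come from the commits above:

```typescript
// Hypothetical, simplified page shape; the project's real data type differs.
interface Page {
  title: string
  date?: Date
  isFolder: boolean
}

type SortFn = (a: Page, b: Page) => number

// Sorts strictly by date (assumed newest first), then alphabetically by title.
function byDateAndAlphabetical(): SortFn {
  return (a, b) => {
    if (a.date && b.date) {
      const diff = b.date.getTime() - a.date.getTime()
      if (diff !== 0) return diff
    } else if (a.date) {
      return -1 // assumed: dated pages come before undated ones
    } else if (b.date) {
      return 1
    }
    return a.title.localeCompare(b.title)
  }
}

// Puts folders first, then defers to the date/alphabetical comparator.
function byDateAndAlphabeticalFolderFirst(): SortFn {
  const fallback = byDateAndAlphabetical()
  return (a, b) => {
    if (a.isFolder !== b.isFolder) return a.isFolder ? -1 : 1
    return fallback(a, b)
  }
}
```

In this sketch a recent-notes list would call `pages.sort(byDateAndAlphabetical())`, while a page list that keeps the previous default would call `pages.sort(byDateAndAlphabeticalFolderFirst())`.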
* fix(explorer): vertically center the Explorer toggle under mobile view
* Added a separate title font configuration
* Added googleSubFontHref function
* Applied --titleFont to PageTitle
* Made googleFontHref return array of URLs
* Dealing with empty and undefined title
* Minor update
* Dealing with empty and undefined title
* Refined font inclusion logic
* Adopted the googleFontHref + googleFontSubsetHref method
* Adaptively include font subset for PageTitle
* Restored default config
* Minor changes on configuration docs
* Formatted source code