* feat: improve search tokenization for CJK languages
Enhance the encoder function to properly tokenize CJK (Chinese, Japanese,
Korean) characters while maintaining English word tokenization. This fixes
search issues where CJK text was not searchable due to whitespace-only
splitting.
Changes:
- Tokenize CJK characters (Hiragana, Katakana, Kanji, Hangul) individually
- Preserve whitespace-based tokenization for non-CJK text
- Support mixed CJK/English content in search queries
This addresses the CJK search issues reported in #2109 where Japanese text
like "て以来" was not searchable because the encoder only split on whitespace.
Tested with Japanese, Chinese, and Korean content to verify character-level
tokenization works correctly while maintaining English search functionality.
* perf: optimize CJK search encoder with manual buffer tracking
Replace regex-based tokenization with index-based buffer management.
This improves performance by ~2.93x according to benchmark results.
- Use explicit buffer start/end indices instead of string concatenation
- Replace split(/\s+/) with direct whitespace code point checks
- Remove redundant filter() operations
- Add CJK Extension A support (U+20000-U+2A6DF)
Performance: ~878ms → ~300ms (100 iterations, mixed CJK/English text)
* test: add comprehensive unit tests for CJK search encoder
Add 21 unit tests covering:
- English word tokenization
- CJK character-level tokenization (Japanese, Korean, Chinese)
- Mixed CJK/English content
- Edge cases
All tests pass, confirming the encoder correctly handles CJK text.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
In 'processors/parse.ts' the 'remarkRehype' plugin is used with
'allowDangerousHtml' enabled, but that needs to be combined with (e.g.)
'rehypeRaw' to have any effect on the output.
* Add rtl automatic detection to base.scss
* Implement RTL support for Arabic and Persian locales and update HTML direction attribute in renderPage component
* Update HTML direction attribute in renderPage component to prioritize frontmatter dir value
* Refactor renderPage component to simplify HTML direction attribute assignment by removing frontmatter dir fallback
* chore(deps): update flexsearch to version 0.8.205 and adjust search encoder.
* refactor(search): enhance search encoder and update search results type
- Improved the encoder function to filter out empty tokens.
- Updated the search results type from a specific FlexSearch type to a more generic 'any' type for flexibility.
- Removed redundant rtl property from the index configuration.
* refactor(search): remove rtl property from search index configuration
* refactor(search): improve encoder function formatting
- Updated the encoder function to use consistent arrow function syntax for better readability.
* refactor(search): update search results type to DefaultDocumentSearchResults
- Imported DefaultDocumentSearchResults from FlexSearch for improved type safety.
- Changed the type of searchResults from 'any' to DefaultDocumentSearchResults<Item> for better clarity and maintainability.
Hot reload was not updating pages when editing Markdown files inside subfolders on Windows only.
The issue was caused by inconsistent path separators from chokidar.
This patch ensures paths are normalized with toPosixPath before rebuild.
There was no dark mode support for quartz equations in typst. This
commit implements such support in order for proper render of typst math
equation in dark mode.
1. Should not create new instance after count.js, as it already setup an
instance.
2. Script block would be removed when navigating with SPA, and it cause
count.js can not find endpoint by query. The solution is to set
endpoint manually.
* fix(flex): respect DesktopOnly and MobileOnly components
* Use classNames util function
* fix(ofm): allow wikilink alias to be empty (#1984)
This is in line with Obsidian's behavior.
* fix(style): Katex adding scrollbars on non-overflowing content (#1989)
* feat(i18n): Bahasa Indonesia translations (#1981)
* fix(a11y): increased content-meta text contrast (#1980)
* fix(analytics): streamline posthog script loading and event capturing (#1974)
* css: adjust color blend for search bg
* feat(links): added ofm option to style unresolved or broken links differently (#1992)
* feat: add option to disable broken wikilinks
* fix(style): update hover color for broken links and introduce new class
* feat: add "disableBrokenWikilinks" option to ObsidianFlavoredMarkdown
* chore(deps): replace `chalk` and `rimraf` with builtin functions (#1879)
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* chore(deps): bump the production-dependencies group across 1 directory with 9 updates (#1996)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Node 22 (#1997)
* docs: showcase housekeeping
* docs: fix explorernode references (closes#1985)
* fix: tz-less date parse in local tz instead of utc (closes#1615)
* docs: added note to not forget to add https:// to the plausible-host (for #1337) (#2000)
* docs: added note to not forget to add https:// to the plausible-host (for #1337)
* Update docs/configuration.md
---------
Co-authored-by: Jacky Zhao <j.zhao2k19@gmail.com>
* Updated documentation
---------
Co-authored-by: Nizav <106657905+Ni-zav@users.noreply.github.com>
Co-authored-by: Aswanth <aswanth366@gmail.com>
Co-authored-by: Jacky Zhao <j.zhao2k19@gmail.com>
Co-authored-by: Keisuke ANDO <g.kei0429@gmail.com>
Co-authored-by: fl0werpowers <47599466+fl0werpowers@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Sebastian Moser <64004956+c2vi@users.noreply.github.com>
* feat: add option to disable broken wikilinks
* fix(style): update hover color for broken links and introduce new class
* feat: add "disableBrokenWikilinks" option to ObsidianFlavoredMarkdown