pub struct InputBuffer { /* private fields */ }
InputBuffer prepares the input data for the analysis.
In this struct, "char" means a Unicode codepoint; the two terms are used as synonyms.
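The difference between char positions and byte positions matters throughout this API. As a minimal standard-library sketch (not taken from the Sudachi source; the helper name `char_and_byte_len` is hypothetical), the same text can be measured both ways. Note that Rust's `char` is a Unicode scalar value, that is, any codepoint except a surrogate.

```rust
/// Returns (number of chars, number of UTF-8 bytes) for `text`.
/// Hypothetical helper, for illustration only.
pub fn char_and_byte_len(text: &str) -> (usize, usize) {
    // `chars()` iterates over Unicode scalar values; `len()` counts bytes.
    (text.chars().count(), text.len())
}
```

For ASCII text the two numbers agree; for Japanese text each char typically takes three bytes in UTF-8, so the numbers diverge.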
Implementations
impl InputBuffer
pub fn new() -> InputBuffer
Creates a new InputBuffer.
pub fn reset(&mut self) -> &mut String
Resets the input buffer so that it can be used to process new input. Write the new input to the returned mutable reference.
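The reset-then-write pattern lets one buffer be reused across inputs without reallocating. The sketch below illustrates the idea with a hypothetical `ReusableBuffer` type built only on the standard library; it is not the Sudachi implementation, and the `finish` and `char_count` methods are assumptions made for the example.

```rust
/// Hypothetical stand-in for a reusable text buffer; not the Sudachi type.
#[derive(Default)]
pub struct ReusableBuffer {
    original: String,
    chars: Vec<char>,
}

impl ReusableBuffer {
    /// Clears all state but keeps the allocations, then hands out the
    /// string so the caller can write the next input into it.
    pub fn reset(&mut self) -> &mut String {
        self.original.clear();
        self.chars.clear();
        &mut self.original
    }

    /// Recomputes derived data once the new input has been written.
    pub fn finish(&mut self) {
        self.chars.clear();
        self.chars.extend(self.original.chars());
    }

    /// Number of chars in the current input.
    pub fn char_count(&self) -> usize {
        self.chars.len()
    }
}
```

A caller would write `buf.reset().push_str(line)` for each new line of input and then run the rest of the pipeline.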
pub fn from<T: AsRef<str>>(data: T) -> InputBuffer
Creates an InputBuffer from the passed string. Intended mostly for tests.
Panics if the input string is too long.
pub fn start_build(&mut self) -> SudachiResult<()>
Moves the InputBuffer into the read-write (RW) state, making it possible to edit it.
pub fn build(&mut self, grammar: &Grammar<'_>) -> SudachiResult<()>
Finalizes the InputBuffer state, making it read-only (RO).
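The RW/RO lifecycle described for start_build and build can be pictured as a small state machine. The following is a sketch under stated assumptions: `StatefulBuffer`, `BufferState`, and `replace_all` are hypothetical names, errors are plain strings rather than SudachiResult, and the real build additionally takes a Grammar, which is omitted here.

```rust
/// Hypothetical buffer states, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BufferState {
    ReadOnly,
    ReadWrite,
}

/// Hypothetical buffer that only accepts edits in the RW state.
pub struct StatefulBuffer {
    text: String,
    state: BufferState,
}

impl StatefulBuffer {
    pub fn new(text: &str) -> Self {
        StatefulBuffer { text: text.to_owned(), state: BufferState::ReadOnly }
    }

    /// Opens the buffer for editing.
    pub fn start_build(&mut self) -> Result<(), String> {
        if self.state == BufferState::ReadWrite {
            return Err("buffer is already in the RW state".to_owned());
        }
        self.state = BufferState::ReadWrite;
        Ok(())
    }

    /// An example edit; rejected unless the buffer is in the RW state.
    pub fn replace_all(&mut self, from: &str, to: &str) -> Result<(), String> {
        if self.state != BufferState::ReadWrite {
            return Err("buffer is read-only".to_owned());
        }
        self.text = self.text.replace(from, to);
        Ok(())
    }

    /// Closes the buffer for editing.
    pub fn build(&mut self) -> Result<(), String> {
        self.state = BufferState::ReadOnly;
        Ok(())
    }

    pub fn text(&self) -> &str {
        &self.text
    }
}
```

Making edits fail outside the RW state keeps derived index tables from going stale between a modification and the final build step.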
pub fn with_editor<'a, F>(&mut self, func: F) -> SudachiResult<()>
Executes a function that can modify the contents of the current buffer.
The edit can borrow &str from the context while the borrow checker still works correctly.
pub fn refresh_chars(&mut self)
Recomputes chars from the modified string (useful if later processing will use chars).
impl InputBuffer
pub fn current_chars(&self) -> &[char]
Borrows the slice of current characters.
pub fn curr_byte_offsets(&self) -> &[usize]
Returns the byte offsets of the current chars.
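To illustrate how a char slice and its byte offsets relate, here is a minimal sketch that derives both from a string with `char_indices`. The function name `chars_and_byte_offsets` is hypothetical, and the trailing end-of-text entry in the offsets is an assumption of this sketch, not something the documentation above states.

```rust
/// Splits `text` into its chars and the byte offset at which each char starts.
/// Hypothetical helper; the final offset (text length) is an assumed sentinel
/// that makes it possible to slice up to the end of the last char.
pub fn chars_and_byte_offsets(text: &str) -> (Vec<char>, Vec<usize>) {
    let mut chars = Vec::new();
    let mut offsets = Vec::new();
    for (byte_idx, ch) in text.char_indices() {
        offsets.push(byte_idx);
        chars.push(ch);
    }
    offsets.push(text.len());
    (chars, offsets)
}
```

With these two tables, the char range `a..b` corresponds to the byte range `offsets[a]..offsets[b]`.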
pub fn get_original_index(&self, index: usize) -> usize
Gets the index of the current byte in the original sentence. Bytes not on character boundaries are not supported.
pub fn to_orig_byte_idx(&self, index: usize) -> usize
Converts a char index in the modified text to a byte index in the original text.
pub fn to_orig_char_idx(&self, index: usize) -> usize
Converts a char index in the modified text to a char index in the original text.
pub fn to_curr_byte_idx(&self, index: usize) -> usize
Converts a char index in the modified text to a byte index in the modified text.
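The index conversions above only make sense once the text has been modified, for example by normalization. The sketch below shows one way such mappings could be kept; it is an illustration, not the Sudachi implementation. `IndexMap`, `replace_tracking`, and the field names are hypothetical, and mapping every byte of a replacement back to the start of its source char is an assumption of this sketch.

```rust
/// Hypothetical index tables for a text that was modified after being read.
pub struct IndexMap {
    /// Byte offset in the modified text of each modified char, plus an end sentinel.
    mod_char_to_byte: Vec<usize>,
    /// For each byte offset in the modified text, a byte offset in the original text.
    mod_byte_to_orig_byte: Vec<usize>,
}

/// Replaces every `from` char with `to`, recording where each modified byte came from.
pub fn replace_tracking(original: &str, from: char, to: &str) -> (String, IndexMap) {
    let mut modified = String::new();
    let mut mod_char_to_byte = Vec::new();
    let mut mod_byte_to_orig_byte = Vec::new();
    for (orig_byte, ch) in original.char_indices() {
        let mut tmp = [0u8; 4];
        let piece: &str = if ch == from { to } else { &*ch.encode_utf8(&mut tmp) };
        for (rel, _) in piece.char_indices() {
            mod_char_to_byte.push(modified.len() + rel);
        }
        // Every byte of the piece maps back to the start of the source char.
        mod_byte_to_orig_byte.extend(std::iter::repeat(orig_byte).take(piece.len()));
        modified.push_str(piece);
    }
    // Sentinels so that the end of the text can be converted as well.
    mod_char_to_byte.push(modified.len());
    mod_byte_to_orig_byte.push(original.len());
    (modified, IndexMap { mod_char_to_byte, mod_byte_to_orig_byte })
}

impl IndexMap {
    /// Modified char index -> modified byte index.
    pub fn to_curr_byte_idx(&self, char_idx: usize) -> usize {
        self.mod_char_to_byte[char_idx]
    }

    /// Modified char index -> original byte index.
    pub fn to_orig_byte_idx(&self, char_idx: usize) -> usize {
        self.mod_byte_to_orig_byte[self.mod_char_to_byte[char_idx]]
    }
}
```

In the example of expanding one char (U+337F) into four, all four modified chars report the same original byte index, because they all originate from the same original char.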
pub fn curr_slice_c(&self, data: Range<usize>) -> &str
Returns a slice of the current (modified) text. Input: a range of char indices in the modified text.
pub fn orig_slice_c(&self, data: Range<usize>) -> &str
Returns a slice of the original text. Input: a range of char indices in the modified text.
pub fn ch_idx(&self, idx: usize) -> usize
pub fn swap_original(&mut self, target: &mut String)
Swaps the original data with the string at the passed location.
pub fn into_original(self) -> String
Returns the original data as an owned String, consuming the buffer.
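Both swap_original and into_original move the text in or out of the buffer without copying it. A minimal sketch of that ownership pattern, using a hypothetical `TextHolder` type and `std::mem::swap` (whether the real implementation does anything beyond the swap is not stated here):

```rust
/// Hypothetical owner of an input string; not the Sudachi type.
pub struct TextHolder {
    original: String,
}

impl TextHolder {
    pub fn new(text: &str) -> Self {
        TextHolder { original: text.to_owned() }
    }

    /// Exchanges the stored string with `target` without copying bytes.
    pub fn swap_original(&mut self, target: &mut String) {
        std::mem::swap(&mut self.original, target);
    }

    /// Gives the stored string back to the caller, consuming the holder.
    pub fn into_original(self) -> String {
        self.original
    }
}
```

Swapping is useful when the caller already owns a String with the next input: the caller gets the old text back, and no allocation takes place.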
pub fn can_bow(&self, offset: usize) -> bool
Returns whether the byte at the given offset can start a new word. Supports bytes not on character boundaries.
pub fn get_word_candidate_length(&self, char_idx: usize) -> usize
Returns the length in chars to the next can_bow point.
Used by the SimpleOOV plugin.
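The idea of a "length to the next can_bow point" can be sketched with a plain slice of flags. This is a simplified illustration: the flag slice is indexed by char here (whereas can_bow above takes a byte offset), the function name `word_candidate_length` is hypothetical, and treating the end of the text as a boundary is an assumption.

```rust
/// `can_bow[i]` says whether a word may begin at char `i` (hypothetical layout).
/// Returns the number of chars from `char_idx` to the next position where a
/// word may begin, or to the end of the text if there is no such position.
pub fn word_candidate_length(can_bow: &[bool], char_idx: usize) -> usize {
    let total = can_bow.len();
    for i in (char_idx + 1)..total {
        if can_bow[i] {
            return i - char_idx;
        }
    }
    total.saturating_sub(char_idx)
}
```

An out-of-vocabulary handler can use such a length to decide how many chars to group into a single candidate word.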
Trait Implementations
impl Clone for InputBuffer
fn clone(&self) -> InputBuffer
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Default for InputBuffer
fn default() -> InputBuffer
impl InputTextIndex for InputBuffer
fn cat_of_range(&self, range: Range<usize>) -> CategoryType
fn cat_at_char(&self, offset: usize) -> CategoryType
fn cat_continuous_len(&self, offset: usize) -> usize
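Judging by its name, cat_continuous_len measures a run of chars that share a character category. The sketch below shows that idea with a toy classifier; `ToyCategory` and `toy_category` are hypothetical and far simpler than the CategoryType used by the library, and the exact semantics of the real method are an assumption here.

```rust
/// Toy character categories, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToyCategory {
    Digit,
    Alpha,
    Other,
}

/// Hypothetical classifier; the real library uses its own category data.
pub fn toy_category(ch: char) -> ToyCategory {
    if ch.is_ascii_digit() {
        ToyCategory::Digit
    } else if ch.is_alphabetic() {
        ToyCategory::Alpha
    } else {
        ToyCategory::Other
    }
}

/// Number of consecutive chars, starting at `offset`, that share the
/// category of the char at `offset`. Returns 0 if `offset` is out of range.
pub fn cat_continuous_len(chars: &[char], offset: usize) -> usize {
    let cat = match chars.get(offset) {
        Some(&first) => toy_category(first),
        None => return 0,
    };
    chars[offset..]
        .iter()
        .take_while(|&&c| toy_category(c) == cat)
        .count()
}
```

Runs like these let a tokenizer group digits or letters together when no dictionary entry covers them.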
fn char_distance(&self, cpt: usize, offset: usize) -> usize
Distance between the char indexed by index and the char relative to it by offset.
Java name: getCodePointsOffsetLength
fn orig_slice(&self, range: Range<usize>) -> &str
Auto Trait Implementations
impl Freeze for InputBuffer
impl RefUnwindSafe for InputBuffer
impl Send for InputBuffer
impl Sync for InputBuffer
impl Unpin for InputBuffer
impl UnwindSafe for InputBuffer
Blanket Implementations
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T where T: Clone
unsafe fn clone_to_uninit(&self, dst: *mut T)
This is a nightly-only experimental API. (clone_to_uninit)
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.