This article demonstrates how to count tokens efficiently for Large Language Models (LLMs) such as Gemma in Rust. By leveraging the tokenizers and hf_hub crates, developers can perform token counting locally, avoiding slower network calls to external endpoints. This approach enables offline usage, reduces complexity, and improves performance. The code snippet provides a practical example, and the article also addresses potential build issues on specific architectures, such as aarch64-pc-windows-msvc.