# Text Encoding Component

UTF-8 text encoding/decoding utilities with automatic polyfill support for React Native.

## Features

- **Standard Compliance**: Compatible with the standard TextEncoder/TextDecoder Web API
- **React Native Support**: Automatic polyfills for environments without native support
- **UTF-8 Only**: Focused implementation supporting only UTF-8 encoding for reliability
- **Performance**: Uses native implementations when available and falls back to polyfills otherwise
- **TypeScript**: Full TypeScript support with comprehensive type definitions

## Installation

This package is designed to be loaded via IOR (Interoperable Object Reference) from Gitea:

```typescript
import { textEncoding } from 'ior:gitea:gitea.metatrom.net:universal-components/text-encoding@1.0.0';
```

## Usage

### Simple Text Encoding/Decoding

The easiest way to use this component is through the default service:

```typescript
import { textEncoding } from 'ior:gitea:gitea.metatrom.net:universal-components/text-encoding@1.0.0';

// Encode string to bytes
const encoded = textEncoding.encode('Hello, 世界! 🌍');
console.log(encoded); // Uint8Array

// Decode bytes to string
const decoded = textEncoding.decode(encoded);
console.log(decoded); // "Hello, 世界! 🌍"
```

### Factory Functions

For more control, use the factory functions:

```typescript
import { createTextEncoder, createTextDecoder } from 'ior:gitea:gitea.metatrom.net:universal-components/text-encoding@1.0.0';

const encoder = createTextEncoder();
const decoder = createTextDecoder();

const bytes = encoder.encode('Hello World');
const text = decoder.decode(bytes);
```

### Advanced Usage

Create decoder instances with options:

```typescript
import { createTextDecoder } from 'ior:gitea:gitea.metatrom.net:universal-components/text-encoding@1.0.0';

// Throw on invalid sequences instead of using replacement character
const fatalDecoder = createTextDecoder('utf-8', { fatal: true });

// Ignore byte order mark
const ignoreBomDecoder = createTextDecoder('utf-8', { ignoreBOM: true });
```

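
The name `ignoreBOM` is easy to misread. In the standard `TextDecoder` API, the default (`ignoreBOM: false`) strips a leading byte order mark from the output, while `ignoreBOM: true` leaves it in the result as U+FEFF. The sketch below uses the standard global `TextDecoder` as a stand-in; it assumes, without having verified it, that this component's decoders follow the same semantics.

```typescript
// Standard TextDecoder, used as a stand-in for createTextDecoder().
const bomBytes = new Uint8Array([0xef, 0xbb, 0xbf, 0x68, 0x69]); // UTF-8 BOM + "hi"

// Default (ignoreBOM: false): the BOM is consumed and is not part of the result.
const withoutBom = new TextDecoder('utf-8').decode(bomBytes);

// ignoreBOM: true: the BOM is left in the result as U+FEFF.
const withBom = new TextDecoder('utf-8', { ignoreBOM: true }).decode(bomBytes);

console.log(withoutBom.length); // 2
console.log(withBom.length); // 3
console.log(withBom.charCodeAt(0) === 0xfeff); // true
```
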
### Direct Polyfill Usage

Access polyfill classes directly for advanced use cases:

```typescript
import { TextEncoderPolyfill, TextDecoderPolyfill } from 'ior:gitea:gitea.metatrom.net:universal-components/text-encoding@1.0.0';

const encoder = new TextEncoderPolyfill();
const decoder = new TextDecoderPolyfill('utf-8', { fatal: false });
```

## API Reference

### textEncoding (Default Service)

The main service instance with convenient methods:

- `encode(text: string): Uint8Array` - Encode string to UTF-8 bytes
- `decode(bytes: Uint8Array | ArrayBuffer | number[]): string` - Decode bytes to string
- `stringToUtf8(text: string): Uint8Array` - Alias for encode()
- `utf8ToString(bytes: Uint8Array | number[]): string` - Alias for decode()

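
The `decode()` signature above accepts three input shapes, whereas the standard `TextDecoder` only takes buffers and views. The following is a minimal sketch of how such inputs could be normalized before decoding. `ByteInput`, `toUint8Array`, and `decodeAny` are hypothetical names used for illustration only, and the standard `TextDecoder` stands in for the component's decoder; the actual implementation may differ.

```typescript
// Hypothetical input type mirroring the decode() signature listed above.
type ByteInput = Uint8Array | ArrayBuffer | number[];

// Hypothetical helper: coerce any accepted input shape to a Uint8Array.
function toUint8Array(input: ByteInput): Uint8Array {
  if (input instanceof Uint8Array) {
    return input; // already a byte view, no copy needed
  }
  // An ArrayBuffer becomes a view over the same memory; a number[] is copied.
  return new Uint8Array(input);
}

// Hypothetical wrapper around the standard decoder.
function decodeAny(input: ByteInput): string {
  return new TextDecoder('utf-8').decode(toUint8Array(input));
}

// Build an ArrayBuffer holding the bytes of "Hi".
const rawBuffer = new ArrayBuffer(2);
new Uint8Array(rawBuffer).set([0x48, 0x69]);

const fromArray = decodeAny([0x48, 0x69]);
const fromBuffer = decodeAny(rawBuffer);
const fromView = decodeAny(new Uint8Array([0x48, 0x69]));
console.log(fromArray, fromBuffer, fromView); // Hi Hi Hi
```
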
### Factory Functions

- `createTextEncoder(): ITextEncoder` - Create encoder instance
- `createTextDecoder(label?: string, options?: TextDecoderOptions): ITextDecoder` - Create decoder instance
- `installTextEncodingPolyfills(): void` - Install global polyfills

### Interfaces

#### ITextEncoder

- `encoding: string` - Always 'utf-8'
- `encode(input?: string): Uint8Array` - Encode string to bytes
- `encodeInto(source: string, destination: Uint8Array): TextEncoderEncodeIntoResult` - Encode into existing array

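
`encodeInto()` is the least obvious member of this interface, so a short illustration may help. In the standard Web API it returns `{ read, written }`: the number of UTF-16 code units consumed and the number of bytes stored, and it never writes a partial character. The sketch uses the standard global `TextEncoder` as a stand-in and assumes the component's encoder behaves the same way.

```typescript
// Standard TextEncoder, used as a stand-in for createTextEncoder().
const encoder = new TextEncoder();
const destination = new Uint8Array(4); // deliberately too small

// "a" = 1 byte, "é" (U+00E9) = 2 bytes, "€" (U+20AC) = 3 bytes: 6 bytes in total.
const result = encoder.encodeInto('a\u00e9\u20ac', destination);

console.log(result.read); // 2 (UTF-16 code units consumed: "a" and "é")
console.log(result.written); // 3 (bytes stored; "€" did not fit and was not written)

// Only the first `written` bytes of the destination are meaningful.
const filled = destination.subarray(0, result.written);
console.log(Array.from(filled).join(',')); // 97,195,169
```

A caller that needs the whole string can resume from `source.slice(result.read)` with a fresh or larger destination.
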
#### ITextDecoder

- `encoding: string` - Always 'utf-8'
- `fatal: boolean` - Whether to throw on invalid sequences
- `ignoreBOM: boolean` - Whether to ignore byte order mark
- `decode(input?: ArrayBufferView | ArrayBuffer, options?: TextDecodeOptions): string` - Decode bytes to string

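
The `options` argument of `decode()` matters mostly when data arrives in chunks. In the standard API, `TextDecodeOptions` carries a `stream` flag that makes the decoder hold back an incomplete multi-byte sequence until the next call. The sketch below shows this with the standard global `TextDecoder`; whether this component's polyfill supports `stream` is an assumption that should be checked against its source.

```typescript
// Standard TextDecoder, used as a stand-in for createTextDecoder().
const streamDecoder = new TextDecoder('utf-8');

// "€" (U+20AC) is the three bytes E2 82 AC, split here across two chunks.
const chunk1 = new Uint8Array([0xe2, 0x82]);
const chunk2 = new Uint8Array([0xac]);

// With stream: true the incomplete sequence is buffered, not replaced with U+FFFD.
const part1 = streamDecoder.decode(chunk1, { stream: true });
const part2 = streamDecoder.decode(chunk2, { stream: true });

// A final call without arguments flushes any remaining buffered bytes.
const tail = streamDecoder.decode();

console.log(JSON.stringify(part1)); // ""
console.log(part1 + part2 + tail); // €
```
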
## Error Handling

By default, invalid UTF-8 sequences decode to the replacement character (U+FFFD). A decoder created with `fatal: true` throws instead:

```typescript
import { textEncoding, createTextDecoder } from 'ior:gitea:gitea.metatrom.net:universal-components/text-encoding@1.0.0';

// Invalid UTF-8 sequences are replaced with � (U+FFFD) by default
const invalidBytes = new Uint8Array([0xFF, 0xFE, 0xFD]);
const result = textEncoding.decode(invalidBytes);
console.log(result); // "���"

// Use fatal mode to throw on errors
const fatalDecoder = createTextDecoder('utf-8', { fatal: true });
try {
  fatalDecoder.decode(invalidBytes);
} catch (error) {
  // `error` is `unknown` under strict TypeScript, so narrow it before reading `.message`
  const message = error instanceof Error ? error.message : String(error);
  console.error('Invalid UTF-8 sequence:', message);
}
```

## Platform Support

- **React Native**: Full support with automatic polyfills
- **Node.js**: Uses native TextEncoder/TextDecoder when available
- **Browsers**: Uses native implementations in modern browsers
- **Automatic Fallback**: Seamlessly falls back to polyfills when native support is unavailable

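
The native-first, polyfill-second behavior described above can be sketched as a simple feature check. Everything named here (`EncoderLike`, `FallbackEncoder`, `pickEncoder`) is hypothetical and written for illustration; it is not the component's actual selection logic or its `TextEncoderPolyfill`.

```typescript
// Hypothetical minimal shape shared by native and fallback encoders.
interface EncoderLike {
  encode(input?: string): Uint8Array;
}

// Hypothetical fallback built on encodeURIComponent, which emits UTF-8 percent escapes.
// Note: encodeURIComponent throws a URIError on lone surrogates; a real polyfill
// would substitute U+FFFD instead.
class FallbackEncoder implements EncoderLike {
  encode(input = ''): Uint8Array {
    const escaped = encodeURIComponent(input);
    const bytes: number[] = [];
    for (let i = 0; i < escaped.length; i++) {
      if (escaped[i] === '%') {
        // "%E4" becomes the byte 0xE4
        bytes.push(parseInt(escaped.slice(i + 1, i + 3), 16));
        i += 2;
      } else {
        // Unescaped characters are always ASCII, so the char code is the byte.
        bytes.push(escaped.charCodeAt(i));
      }
    }
    return Uint8Array.from(bytes);
  }
}

// Prefer the native TextEncoder when the runtime provides one.
function pickEncoder(): EncoderLike {
  const scope = globalThis as unknown as { TextEncoder?: new () => EncoderLike };
  return typeof scope.TextEncoder === 'function'
    ? new scope.TextEncoder()
    : new FallbackEncoder();
}

const picked = Array.from(pickEncoder().encode('\u4e16')).join(',');
const fallback = Array.from(new FallbackEncoder().encode('\u4e16')).join(',');
console.log(picked); // 228,184,150
console.log(fallback); // 228,184,150
```
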
## Performance Notes

- Native implementations are preferred when available for optimal performance
- Polyfills are optimized for correctness and reasonable performance
- UTF-8 validation is performed to ensure data integrity
- Surrogate pairs are handled so that characters above U+FFFF encode and decode correctly

## Unicode Support

This implementation fully supports the Unicode standard:

- All valid Unicode code points (U+0000 to U+10FFFF)
- Proper surrogate pair handling for characters above U+FFFF
- UTF-8 validation with proper error handling
- BOM (Byte Order Mark) support with optional ignoring

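
To make the surrogate pair point concrete, here is a sketch of a UTF-8 encoder that combines surrogate pairs into a single code point before emitting bytes. `encodeUtf8` is a hypothetical function written for this README, not the component's polyfill source. Replacing lone surrogates with U+FFFD follows the standard `TextEncoder` behavior and is assumed to match the polyfill.

```typescript
// Hypothetical illustration of UTF-8 encoding with surrogate pair handling.
function encodeUtf8(text: string): Uint8Array {
  const bytes: number[] = [];
  for (let i = 0; i < text.length; i++) {
    let codePoint = text.charCodeAt(i);

    if (codePoint >= 0xd800 && codePoint <= 0xdbff) {
      // High surrogate: combine with the following low surrogate, if present.
      const low = i + 1 < text.length ? text.charCodeAt(i + 1) : 0;
      if (low >= 0xdc00 && low <= 0xdfff) {
        codePoint = 0x10000 + ((codePoint - 0xd800) << 10) + (low - 0xdc00);
        i++; // the low surrogate has been consumed
      } else {
        codePoint = 0xfffd; // lone high surrogate
      }
    } else if (codePoint >= 0xdc00 && codePoint <= 0xdfff) {
      codePoint = 0xfffd; // lone low surrogate
    }

    if (codePoint < 0x80) {
      bytes.push(codePoint); // 1 byte: U+0000..U+007F
    } else if (codePoint < 0x800) {
      bytes.push(0xc0 | (codePoint >> 6), 0x80 | (codePoint & 0x3f)); // 2 bytes
    } else if (codePoint < 0x10000) {
      bytes.push(
        0xe0 | (codePoint >> 12),
        0x80 | ((codePoint >> 6) & 0x3f),
        0x80 | (codePoint & 0x3f),
      ); // 3 bytes
    } else {
      bytes.push(
        0xf0 | (codePoint >> 18),
        0x80 | ((codePoint >> 12) & 0x3f),
        0x80 | ((codePoint >> 6) & 0x3f),
        0x80 | (codePoint & 0x3f),
      ); // 4 bytes: U+10000..U+10FFFF
    }
  }
  return Uint8Array.from(bytes);
}

// U+1F30D (the globe emoji) is stored in a string as the pair D83C DF0D.
const globeBytes = Array.from(encodeUtf8('\ud83c\udf0d')).join(',');
console.log(globeBytes); // 240,159,140,141
```
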
## Version Information

- Version: 1.0.0
- Component Name: text-encoding
- IOR: `ior:gitea:gitea.metatrom.net:universal-components/text-encoding@1.0.0`