AI Document Classification
oklido uses artificial intelligence to automatically classify documents and extract key information.
How It Works
When you upload a document, or one arrives by email, it goes through these steps in order:
- Text Extraction - OCR and text parsing
- Classification - AI identifies the document type
- Entity Recognition - Key dates, amounts, and names are extracted
- Source Matching - AI suggests the document source
- Quality Check - Each classification is given a confidence score
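The steps above can be pictured as a chain of stages, each passing an enriched record to the next. The sketch below is illustrative only and is not oklido's implementation: every function name, the dictionary keys, the keyword rule standing in for the AI classifier, and the dollar-amount pattern are assumptions made for the example.

```python
import re

# Hypothetical stages mirroring the five steps listed above. Each stage takes
# the working record (a dict) and returns an updated copy.

def extract_text(doc: dict) -> dict:
    # Placeholder: a real system would run OCR or a PDF text parser here.
    return {**doc, "text": doc["raw"].decode("utf-8", errors="replace")}

def classify(doc: dict) -> dict:
    # Toy keyword rule standing in for the AI classifier (assumption).
    is_call = "capital call" in doc["text"].lower()
    return {
        **doc,
        "doc_type": "Capital Call" if is_call else "Correspondence",
        "confidence": 0.95 if is_call else 0.50,
    }

def recognise_entities(doc: dict) -> dict:
    # Toy pattern for dollar amounts such as $25,000.00 (assumption).
    amounts = re.findall(r"\$[\d,]+(?:\.\d{2})?", doc["text"])
    return {**doc, "amounts": amounts}

def match_source(doc: dict) -> dict:
    # Suggest the first known source whose name appears in the text.
    text = doc["text"].lower()
    source = next((s for s in doc["known_sources"] if s.lower() in text), None)
    return {**doc, "suggested_source": source}

def quality_check(doc: dict) -> dict:
    # Flag anything that is not in the high-confidence band (above 90%).
    return {**doc, "needs_review": doc["confidence"] <= 0.90}

PIPELINE = [extract_text, classify, recognise_entities, match_source, quality_check]

def process(raw: bytes, known_sources: list[str]) -> dict:
    doc = {"raw": raw, "known_sources": known_sources}
    for stage in PIPELINE:
        doc = stage(doc)
    return doc

result = process(
    b"Capital call notice from Alpha Fund: $25,000.00 due 1 March",
    ["Alpha Fund"],
)
```

Keeping each step as a separate function makes the order explicit and lets a single stage be replaced without touching the others.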
What Gets Extracted
Document Type
oklido recognises these categories:
| Type | Description | Examples |
|---|---|---|
| Investment Memo | Fund overviews, investment summaries | Fund fact sheets, PPMs |
| Quarterly Report | Performance updates, NAV statements | Q1/Q2/Q3/Q4 reports |
| Distribution Notice | Payout notifications | Dividend notices, distribution letters |
| Capital Call | Funding requests | Capital call notices, drawdown requests |
| Tax Document | Tax-related documents | K-1s, tax certificates, 1099s |
| Legal Document | Legal agreements | Subscription docs, amendments |
| Correspondence | Letters and emails | Cover letters, announcements |
Key Information
The AI extracts:
- Dates - Document date, reporting period, due dates
- Amounts - Investment values, distributions, calls
- Names - Fund names, manager names, counterparties
- Reference Numbers - Account numbers, document IDs
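To make the shape of the extracted data concrete, here is a minimal sketch of how one result could be modelled. The document types come from the table above; the class names, field names, and sample values are hypothetical and do not describe oklido's actual data model or API.

```python
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal
from enum import Enum
from typing import Optional

class DocumentType(Enum):
    # The seven categories listed in the table above.
    INVESTMENT_MEMO = "Investment Memo"
    QUARTERLY_REPORT = "Quarterly Report"
    DISTRIBUTION_NOTICE = "Distribution Notice"
    CAPITAL_CALL = "Capital Call"
    TAX_DOCUMENT = "Tax Document"
    LEGAL_DOCUMENT = "Legal Document"
    CORRESPONDENCE = "Correspondence"

@dataclass
class ExtractedInfo:
    # Hypothetical record grouping the four kinds of key information.
    doc_type: DocumentType
    document_date: Optional[date] = None
    due_date: Optional[date] = None
    amounts: list[Decimal] = field(default_factory=list)
    names: list[str] = field(default_factory=list)
    reference_numbers: list[str] = field(default_factory=list)

# Invented sample values, for illustration only.
example = ExtractedInfo(
    doc_type=DocumentType.CAPITAL_CALL,
    document_date=date(2024, 3, 1),
    due_date=date(2024, 3, 15),
    amounts=[Decimal("25000.00")],
    names=["Example Fund LP"],
    reference_numbers=["ACC-0001"],
)
```

Amounts are held as `Decimal` rather than `float` in this sketch so that monetary values keep their exact decimal representation.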
Accuracy & Confidence
Confidence Scoring
Each classification includes a confidence score:
- High (above 90%) - Very confident, auto-applied
- Medium (70-90%) - Likely correct, review recommended
- Low (below 70%) - Uncertain, manual review required
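The three bands can be expressed as a small function. This is a sketch under two assumptions: the score is represented as a fraction between 0.0 and 1.0, and scores of exactly 70% or 90% fall in the medium band, which is how "above 90%" and "below 70%" read. The function name and return values are hypothetical.

```python
def confidence_band(score: float) -> str:
    """Map a confidence score (0.0 to 1.0) to the band described above."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0.0 and 1.0")
    if score > 0.90:
        return "high"    # auto-applied
    if score >= 0.70:
        return "medium"  # review recommended
    return "low"         # manual review required
```

For example, `confidence_band(0.95)` returns `"high"`, while `confidence_band(0.90)` returns `"medium"` because 90% is not above the threshold.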
Improving Accuracy
The AI learns from your corrections:
- Review suggested classifications
- Correct any errors
- The AI applies what it learns to similar documents
Manual Override
You can always override AI suggestions:
- Open the document
- Click "Edit" in the details panel
- Change the document type or source
- Save your changes
Your corrections help improve future classifications.
Supported Languages
Support varies by language:
- English - Full support
- Other languages - Basic support (text extraction works, but classification may be less accurate)
Processing Time
| Input | Typical Time |
|---|---|
| Text-based PDF | 2-5 seconds |
| Scanned PDF (OCR) | 5-15 seconds |
| Large documents | Up to 30 seconds |
| Bulk uploads | Queued processing |
Privacy & Security
Your documents are processed securely:
- Encrypted in transit using TLS 1.3
- Encrypted at rest using AES-256
- UK data residency - All processing takes place in London (AWS eu-west-2)
- No third-party AI - Processing uses our own models
- GDPR compliant - Full data subject rights
Limitations
The AI works best with:
- Standard financial document formats
- Clear, readable text
- Documents in English
It may struggle with:
- Handwritten documents
- Poor-quality scans
- Unusual document formats
- Documents in languages other than English