Key Features
- Flexible GenAI Model Selection: Empower your analysis by choosing from a wide range of leading Generative AI models. The system supports multiple providers, including OpenAI, Azure OpenAI, Google Gemini, Anthropic, Perplexity, and DeepSeek. For full control and privacy, it also integrates with local Ollama models.
- Automated Metadata Extraction: Automatically parse PDF and DOCX documents to extract a detailed, structured set of metadata. The AI is specifically tuned to find key information like policy titles, geographic scope, publication dates, focus areas, and implementation stages.
- In-Depth Policy Gap Analysis: Move beyond metadata and evaluate documents against critical principles. The system performs a gap analysis against a fully customizable set of criteria, such as Accountability, Transparency, and Fairness.
- Customizable Analysis Principles: A dedicated "Manage Prompts" interface allows you to create, edit, and save the very principles used for analysis. This ensures the system evaluates documents against your organization's specific standards.
- Seamless Airtable Integration: Instantly save your extracted metadata to a configured Airtable base with a single click. This creates a structured, shareable, and analyzable database of all processed policies.
- Versatile Data Export: Download your results in convenient formats. Extracted metadata is available as a JSON file, and the detailed gap analysis report can be saved as a clean, easy-to-read DOCX, PDF, or Markdown file.
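As a rough illustration of how customizable principles could feed a gap analysis, the sketch below assembles a single prompt from a list of user-defined principles. The `Principle` class, the `build_gap_prompt` function, and the prompt wording are all hypothetical; the prompts saved through "Manage Prompts" may be structured quite differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Principle:
    """A user-defined analysis principle (hypothetical structure)."""
    name: str
    description: str


def build_gap_prompt(principles: list[Principle], document_text: str) -> str:
    """Assemble one gap-analysis prompt covering every selected principle."""
    if not principles:
        raise ValueError("Select at least one principle to analyze against.")
    # One bullet per principle so the model can address each in turn.
    criteria = "\n".join(f"- {p.name}: {p.description}" for p in principles)
    return (
        "Evaluate the policy document below against each principle. "
        "For every principle, state whether it is addressed and describe any gaps.\n\n"
        f"Principles:\n{criteria}\n\n"
        f"Document:\n{document_text}"
    )


# Example principles; the descriptions are illustrative only.
principles = [
    Principle("Accountability", "Names who is responsible for outcomes."),
    Principle("Transparency", "Explains how decisions are made and disclosed."),
]
prompt = build_gap_prompt(principles, "Sample policy text.")
```

Keeping principles as data rather than hard-coded prompt text is what would let an organization swap in its own standards without touching the analysis code.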
Applications & Use Cases
Designed for versatility, the platform adapts to the needs of researchers, policymakers, and compliance officers across various sectors.
🌍 National Policy Alignment
Benchmark national strategies against global frameworks (e.g., SDGs, AI Ethics guidelines) to visualize alignment and identify opportunities for harmonization.
📊 Cross-Jurisdictional Analysis
Standardize and compare regulatory approaches across multiple countries or regions to identify trends, outliers, and emerging best practices.
🔒 Secure Internal Auditing
Use local, offline models in an air-gapped setup to review sensitive HR, legal, or security policies without exposing data to the cloud.
📂 Institutional Knowledge Mining
Transform "dark data" (static PDFs) into structured, searchable assets (Airtable), allowing organizations to visualize their policy landscape over time.
Strategic Architecture & Roadmap
Our platform's multi-model approach is a strategic necessity. It allows users to decouple intelligence from cost, ensuring that the engine used matches the cognitive load of the task.
1. Nuanced Model Routing
The choice is no longer just between "Smart/Expensive" and "Fast/Cheap." Disruptors like DeepSeek have entered the market, offering frontier-class reasoning at significantly lower price points. Our architecture supports three distinct tiers of execution:
| Task Tier | Use Case | Recommended Models | Strategic Value |
|---|---|---|---|
| 1. Structural Extraction | Formatting dates, authors, IDs; JSON compliance. | GPT-4o-mini, Gemini Flash | Lowest Cost: Ideal for high-speed, repetitive data entry. |
| 2. Efficient Reasoning | Complex policy analysis, gap detection. | DeepSeek V3, Llama 3 (via Groq/Ollama) | High Power / Low Cost: Delivers GPT-4 class logic at a fraction of the price. |
| 3. Frontier Analysis | Nuanced interpretation of ambiguous legal clauses. | Claude Sonnet/Opus, GPT-5 | Maximum Nuance: Reserved for cases where brand trust outweighs cost. |
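The tier table can be read as a lookup from task tier to candidate models. The sketch below is one possible way to express that routing, not the platform's actual implementation: `TaskTier`, `RECOMMENDED_MODELS`, and `pick_model` are hypothetical names, and the model labels are copied from the table rather than being real provider model IDs.

```python
from enum import Enum


class TaskTier(Enum):
    STRUCTURAL_EXTRACTION = 1
    EFFICIENT_REASONING = 2
    FRONTIER_ANALYSIS = 3


# Labels mirror the tier table; actual provider model IDs will differ.
RECOMMENDED_MODELS: dict[TaskTier, list[str]] = {
    TaskTier.STRUCTURAL_EXTRACTION: ["GPT-4o-mini", "Gemini Flash"],
    TaskTier.EFFICIENT_REASONING: ["DeepSeek V3", "Llama 3"],
    TaskTier.FRONTIER_ANALYSIS: ["Claude Sonnet", "Claude Opus", "GPT-5"],
}


def pick_model(tier: TaskTier, available: set[str]) -> str:
    """Return the first recommended model for the tier that is configured."""
    for model in RECOMMENDED_MODELS[tier]:
        if model in available:
            return model
    raise LookupError(f"No configured model for tier {tier.name}")


# Only two models are configured in this example.
choice = pick_model(TaskTier.EFFICIENT_REASONING, {"Llama 3", "GPT-5"})
print(choice)  # Llama 3
```

Ordering each list by preference means a deployment with only some providers configured still gets the best available match for the tier.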
2. Future Roadmap: Batch Scalability (Planned Phase)
From Calibration to Automation: Currently, the Streamlit interface serves as a "Calibration Lab," allowing users to refine prompts and identify the best LLM for their specific document types.
The next architectural phase will introduce Batch Mode Processing. By leveraging efficient models like DeepSeek, this mode will let the system target specific directories for automated, high-volume execution. Operations will scale from single documents to entire repositories with minimal user intervention and optimized spend.
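Since Batch Mode is still planned, the following is only a sketch of what directory-targeted processing might look like. `find_documents` and `run_batch` are hypothetical names, and `process` stands in for whatever extraction or analysis call the platform eventually exposes.

```python
from pathlib import Path
from typing import Callable, Iterator

# The platform parses PDF and DOCX files, so only those are collected.
SUPPORTED_SUFFIXES = {".pdf", ".docx"}


def find_documents(root: Path) -> Iterator[Path]:
    """Yield supported documents under root, recursively, in sorted order."""
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in SUPPORTED_SUFFIXES:
            yield path


def run_batch(root: Path, process: Callable[[Path], dict]) -> dict[str, dict]:
    """Apply process to each document; record failures instead of aborting."""
    results: dict[str, dict] = {}
    for doc in find_documents(root):
        key = doc.relative_to(root).as_posix()  # unique even across subfolders
        try:
            results[key] = process(doc)
        except Exception as exc:  # one bad file should not stop the batch
            results[key] = {"error": str(exc)}
    return results
```

Recording errors per file instead of raising keeps a long run going when a single document fails to parse, which matters once a batch covers an entire repository.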
User Interface
The system features a clean, intuitive web interface built with Streamlit, allowing for easy document upload, model selection, and task execution.
Sample Analysis Report
The "Process Analysis" task produces a detailed Markdown report, evaluating the document against each selected principle. This report highlights potential gaps and discrepancies for review.
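To give a feel for the report's shape, here is a minimal sketch that renders per-principle findings as Markdown. The `Finding` structure, the `render_report` function, and the heading layout are assumptions for illustration; the actual report format may differ.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """One principle's evaluation (hypothetical structure)."""
    principle: str
    addressed: bool
    notes: str


def render_report(title: str, findings: list[Finding]) -> str:
    """Render one Markdown section per evaluated principle."""
    lines = [f"# Gap Analysis: {title}", ""]
    for finding in findings:
        status = "Addressed" if finding.addressed else "Gap identified"
        lines += [
            f"## {finding.principle}",
            "",
            f"**Status:** {status}",
            "",
            finding.notes,
            "",
        ]
    return "\n".join(lines)


# Illustrative findings, not output from a real analysis run.
report = render_report(
    "Sample Policy",
    [
        Finding("Accountability", True, "Ownership is assigned to a named role."),
        Finding("Transparency", False, "No disclosure process is described."),
    ],
)
print(report)
```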
Sample Metadata Output (JSON)
After a successful "Process Extraction" task, the system generates a structured JSON file based on the document's content. This file is ready for download or for sending directly to Airtable.
{
  "Title": "Global Data Privacy and AI Ethics Policy",
  "GeographicScopeType": "Global",
  "GeographicScopeSpecify": [
    "Global",
    "EU",
    "APAC"
  ],
  "Organization": "TechCorp International",
  "PublishedDate": "2025-01-15",
  "EffectiveDate": "2025-02-01",
  "RevisedDate": "2025-10-01"
}
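Before sending output like this to Airtable, it can help to check the date fields and wrap the record in the request body that Airtable's create-records endpoint expects. The sketch below does only that, without making any network call. `validate_metadata` and `to_airtable_payload` are hypothetical helpers, and the code assumes the Airtable table's column names match the JSON keys exactly.

```python
import json
from datetime import date

DATE_FIELDS = ("PublishedDate", "EffectiveDate", "RevisedDate")


def validate_metadata(raw: str) -> dict:
    """Parse the JSON and check that any date fields are valid ISO dates."""
    data = json.loads(raw)
    for field in DATE_FIELDS:
        if field in data:
            date.fromisoformat(data[field])  # raises ValueError if malformed
    return data


def to_airtable_payload(metadata: dict) -> dict:
    """Wrap metadata in the {"records": [{"fields": ...}]} request body."""
    return {"records": [{"fields": metadata}]}


# A trimmed version of the sample output shown in this section.
sample = """{
  "Title": "Global Data Privacy and AI Ethics Policy",
  "GeographicScopeSpecify": ["Global", "EU", "APAC"],
  "PublishedDate": "2025-01-15"
}"""
payload = to_airtable_payload(validate_metadata(sample))
```

A list value such as `GeographicScopeSpecify` would need a compatible column type on the Airtable side (for example, a multiple-select field), so the base configuration should be checked against the JSON schema before relying on this mapping.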