Overview
Discover what makes LanguageTool powerful
LanguageTool is an open‑source, rule‑based proofreading engine that supports more than 20 languages. For developers it is an extensible Java library and HTTP server that can be embedded in web services, IDEs, or desktop applications. Its core engine splits text into sentences and tokens, tags them with part‑of‑speech information, and applies language‑specific rule sets (grammar, style, punctuation); the HTTP server returns the resulting suggestions as structured JSON. The engine itself is language‑agnostic: new rules are written in XML or Java, and the engine loads them per language at runtime.
Architecture
- Language: The core is written in Java (OpenJDK 11+), packaged as a single JAR that can be run with `java -jar`. The HTTP server is built on the JDK's built‑in HTTP server (`com.sun.net.httpserver`) and exposes a RESTful API (`/v2/check`).
- Rule Engine: Rules are defined in XML rule files or as Java classes that extend the abstract `Rule` class. The engine compiles these into a fast in‑memory rule set and caches it per language to avoid repeated parsing.
- Data Store: By default the server is stateless and keeps no persistent data. Optional extensions (e.g., a custom dictionary or user preferences) can be backed by an embedded H2 database or external PostgreSQL.
- Extensibility: The Java API exposes a plugin hook (`addRule(Rule rule)`) and a public HTTP endpoint for custom rules, allowing developers to integrate domain‑specific checks (e.g., legal or medical terminology). Webhooks can be configured to trigger on check completion for CI pipelines.
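The rule-engine idea above (rules applied to text, with the loaded rule set cached per language) can be sketched in a few lines of plain Java. This is a toy model only: `ToyRule`, `ToyMatch`, `RepeatedWordRule`, and `ToyRuleCache` are hypothetical names invented for illustration and are not LanguageTool classes, and the real engine works on analyzed tokens rather than on raw regular expressions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical toy types for illustration; these are NOT LanguageTool classes.
final class ToyMatch {
    final String ruleId;
    final int offset;
    final int length;

    ToyMatch(String ruleId, int offset, int length) {
        this.ruleId = ruleId;
        this.offset = offset;
        this.length = length;
    }
}

interface ToyRule {
    List<ToyMatch> check(String text);
}

// Flags an immediately repeated word such as "the the".
final class RepeatedWordRule implements ToyRule {
    private static final Pattern REPEAT =
        Pattern.compile("\\b(\\w+)\\s+\\1\\b", Pattern.CASE_INSENSITIVE);

    @Override
    public List<ToyMatch> check(String text) {
        List<ToyMatch> found = new ArrayList<>();
        Matcher m = REPEAT.matcher(text);
        while (m.find()) {
            found.add(new ToyMatch("TOY_REPEATED_WORD", m.start(), m.end() - m.start()));
        }
        return found;
    }
}

// Builds the rule set for a language once and reuses it for later checks.
final class ToyRuleCache {
    private final Map<String, List<ToyRule>> byLanguage = new ConcurrentHashMap<>();
    final AtomicInteger loads = new AtomicInteger(); // counts rule-set builds, for demonstration

    List<ToyRule> rulesFor(String languageCode) {
        return byLanguage.computeIfAbsent(languageCode, code -> {
            loads.incrementAndGet(); // stands in for the expensive parsing of rule files
            return List.of(new RepeatedWordRule());
        });
    }

    List<ToyMatch> check(String languageCode, String text) {
        List<ToyMatch> all = new ArrayList<>();
        for (ToyRule rule : rulesFor(languageCode)) {
            all.addAll(rule.check(text));
        }
        return all;
    }
}
```

Calling `check` twice with the same language code builds the rule list only once; the second call reuses the cached list, which is the behavior the Rule Engine bullet describes.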
Core Capabilities
- HTTP API: Full Swagger‑defined REST interface (`POST /v2/check`) that accepts plain text or JSON payloads with language tags and returns suggestions, confidence scores, and rule IDs.
- Java API: The `JLanguageTool` class offers programmatic access: parse, get matches, apply corrections. The Javadoc is comprehensive and includes examples for custom rule creation.
- Rule Customization: Rules can be overridden or extended via XML (`<rule>` elements) or by subclassing `Rule`. The development overview documents the rule‑creation workflow.
- Multi‑language Support: Each language has its own rule set and dictionaries; the engine selects them based on the `language` parameter.
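As a minimal sketch of calling the HTTP API from Java, the snippet below builds a form-encoded `POST /v2/check` request with the JDK's `java.net.http` classes. It assumes the endpoint accepts `text` and `language` form parameters; the class name `CheckRequestSketch` and the base URL passed in are hypothetical and depend on how the server is deployed.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not part of LanguageTool.
final class CheckRequestSketch {

    // Encodes parameters as application/x-www-form-urlencoded, preserving map order.
    static String formEncode(Map<String, String> params) {
        return params.entrySet().stream()
            .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
            .collect(Collectors.joining("&"));
    }

    // Builds (but does not send) a check request against the given server.
    static HttpRequest buildCheckRequest(String baseUrl, String text, String language) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("text", text);
        params.put("language", language); // e.g. "en-US"; assumed parameter name
        return HttpRequest.newBuilder(URI.create(baseUrl + "/v2/check"))
            .header("Content-Type", "application/x-www-form-urlencoded")
            .POST(HttpRequest.BodyPublishers.ofString(formEncode(params)))
            .build();
    }
}
```

The request can then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`, which returns the JSON response body as a string.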
Deployment & Infrastructure
- Self‑Hosting: A single JAR can be deployed on any JVM‑capable host. The bundled `install.sh` script automates dependency installation (Java, optional database).
- Containerization: Community Docker images exist; the base image is `openjdk:17-jdk-slim`. The container exposes port 8080 and can be scaled horizontally behind a load balancer.
- Scalability: Statelessness allows easy horizontal scaling; the rule cache can be shared via Redis if needed. For high‑throughput scenarios, a reverse proxy (NGINX) can handle TLS termination and request throttling.
- Resource Footprint: A single instance consumes ~200 MB RAM and runs comfortably on a 2‑core CPU, making it suitable for edge deployments or lightweight CI runners.
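Because the server keeps no per-request state, any instance can answer any request. In production a load balancer normally does the distribution; the following client-side round-robin picker is only a sketch of the same idea, and both the class name `RoundRobinBackends` and the backend URLs are hypothetical.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper for illustration: cycles through stateless server instances.
final class RoundRobinBackends {
    private final List<String> baseUrls;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinBackends(List<String> baseUrls) {
        if (baseUrls.isEmpty()) {
            throw new IllegalArgumentException("at least one backend is required");
        }
        this.baseUrls = List.copyOf(baseUrls);
    }

    // Thread-safe; floorMod keeps the index valid even after the counter overflows.
    String pick() {
        int index = Math.floorMod(next.getAndIncrement(), baseUrls.size());
        return baseUrls.get(index);
    }
}
```

Each call to `pick()` returns the next base URL in order and wraps around at the end of the list, so requests are spread evenly without any coordination between instances.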
Integration & Extensibility
- Plugins: The server can load additional rule JARs at runtime; community plugins add support for Microsoft Office, LibreOffice, or custom IDE integrations.
- Webhooks & Callbacks: Developers can register endpoints to receive asynchronous notifications after a check, enabling integration with CI/CD pipelines or chatops.
- Customization: User dictionaries can be injected via the API; regional variants (e.g., British vs. American English) are selectable by language variant codes (`en-GB`, `en-US`).
- SDKs: While the primary SDK is Java, community bindings exist for Python (`language_tool_python`) and Node.js (`languagetool-api`), simplifying integration in polyglot stacks.
Developer Experience
- Documentation: The project hosts extensive docs, including an API reference, rule‑creation guides, and a Swagger UI. Code comments are concise yet informative.
- Community: A dedicated forum and GitHub issue tracker provide prompt support; contributors can submit rule proposals via pull requests.
- Licensing: LGPL 2.1 allows embedding in both open‑source and proprietary projects, provided modifications to the core are disclosed.
- Testing: A robust suite of unit tests (JUnit) covers rule logic; CI pipelines run nightly builds and integration tests against the Docker image.
Use Cases
| Scenario | Why LanguageTool Fits |
|---|---|
| CI/Quality Gates | Run POST /v2/check on pull requests to enforce style guidelines before merge. |
| Enterprise Writing Assistants | Embed the Java API in internal document editors to provide real‑time grammar checks. |
| Multilingual Chatbots | Validate user input across languages before processing or storing. |
| Content Management Systems | Hook into CMS workflows to flag errors during content creation or review. |
| Educational Platforms | Offer students automated feedback on essays in multiple languages. |
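For the CI/Quality Gates scenario, a minimal gate could inspect the JSON returned by `/v2/check` and fail the build when too many findings come back. The sketch below assumes each entry in `matches` carries a `rule` object whose first field is `id`; it uses a regular expression only to stay dependency-free, and a real integration should use a proper JSON parser. `QualityGateSketch` and the `maxFindings` threshold are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of LanguageTool.
final class QualityGateSketch {
    // Matches "rule":{"id":"SOME_ID" and captures SOME_ID (assumed response shape).
    private static final Pattern RULE_ID =
        Pattern.compile("\"rule\"\\s*:\\s*\\{\\s*\"id\"\\s*:\\s*\"([^\"]+)\"");

    // Collects the rule ID of every finding in the response, in order.
    static List<String> ruleIds(String responseJson) {
        List<String> ids = new ArrayList<>();
        Matcher m = RULE_ID.matcher(responseJson);
        while (m.find()) {
            ids.add(m.group(1));
        }
        return ids;
    }

    // The gate passes when the number of findings does not exceed the threshold.
    static boolean passes(String responseJson, int maxFindings) {
        return ruleIds(responseJson).size() <= maxFindings;
    }
}
```

A CI step could call `passes(body, 0)` on the response body and exit with a non-zero status when it returns `false`, printing the collected rule IDs so authors know which checks fired.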
Advantages
- Performance: Rules are compiled and cached, and latency is under 30 ms for typical sentences.
- Flexibility: Full control over rule sets; developers can add or remove rules without redeploying the entire server.
- Open Source & License: LGPL allows commercial use without licensing fees, unlike proprietary grammar services.
- Self‑Hosting: No reliance on external APIs; data never leaves the organization, satisfying strict privacy requirements.