LanguageTool

Self-Hosted

Open‑source AI grammar and style checker for 20+ languages

Overview

LanguageTool is an open‑source, largely rule‑based proofreading engine that supports more than 20 languages. For developers it is an extensible Java library and HTTP server that can be embedded in web services, IDEs, or desktop applications. The core engine splits text into sentences and tokens, tags the tokens with part‑of‑speech information, applies language‑specific rule sets (grammar, style, punctuation), and, when used through the HTTP server, returns suggestions as structured JSON. Language support is modular: rules are written in XML or Java, and the engine loads the rules for each language at runtime.

Architecture

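  • Language: The core is written in Java and distributed as JAR files; the minimum JVM version depends on the LanguageTool release. The standalone HTTP server ships as languagetool-server.jar and is started with the java command. It is built on the JDK's built‑in HTTP server rather than a separate web framework and exposes an HTTP API whose main endpoint is /v2/check.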
  • Rule Engine: Rules are defined in XML rule files or as Java classes that extend the abstract Rule class. The engine loads them into an in‑memory rule set and caches it per language to avoid repeated parsing.
  • Data Store: By default the server is stateless; it keeps no persistent data. Optional extensions (e.g., a custom dictionary or user preferences) can be backed by an embedded H2 database or external PostgreSQL.
  • Extensibility: The Java API exposes JLanguageTool.addRule(Rule rule) for registering custom rules in process, which lets developers add domain‑specific checks (e.g., legal or medical terminology). For CI pipelines, the usual approach is to call the HTTP API from a pipeline step and act on the returned matches.
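To make the XML side of the rule engine concrete, here is a minimal sketch of a pattern rule in the style of LanguageTool's grammar.xml files. The rule id, name, and wording are invented for illustration. The Python around it only inspects the XML with the standard library; it is not how LanguageTool itself loads rules.

```python
import xml.etree.ElementTree as ET

# Illustrative rule in the style of LanguageTool's XML rule files.
# The id, name, and message text are made up for this example.
RULE_XML = """
<rule id="EXAMPLE_BED_ENGLISH" name="Possible typo: 'bed English'">
  <pattern>
    <token>bed</token>
    <token>English</token>
  </pattern>
  <message>Did you mean <suggestion>bad English</suggestion>?</message>
  <example correction="bad English">Sorry for my <marker>bed English</marker>.</example>
</rule>
"""


def summarize_rule(xml_text):
    """Return the rule id, the token sequence it matches, and its suggestions."""
    rule = ET.fromstring(xml_text)
    tokens = [t.text for t in rule.findall("./pattern/token")]
    suggestions = [s.text for s in rule.findall("./message/suggestion")]
    return {"id": rule.get("id"), "tokens": tokens, "suggestions": suggestions}


summary = summarize_rule(RULE_XML)
print(summary)
```

The pattern lists the tokens to match in order, the message carries the suggestion shown to the user, and the example sentence doubles as a test case for the rule.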

Core Capabilities

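  • HTTP API: A Swagger‑documented HTTP interface. POST /v2/check accepts form‑encoded parameters, chiefly the text to check and a language code, and returns a JSON list of matches. Each match carries a message, a character offset and length, suggested replacements, and the ID of the rule that fired.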
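  • Java API: The JLanguageTool class offers programmatic access: check(text) returns a list of RuleMatch objects with positions and suggested replacements, which the caller can then apply. The Javadoc and developer documentation include examples for custom rule creation.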
  • Rule Customization: Rules can be overridden or extended via XML (<rule> elements) or by subclassing Rule. The development overview documents the rule‑creation workflow.
  • Multi‑language Support: Each language has its own rule set and dictionaries; the engine selects them based on the language parameter.
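As a rough sketch of the HTTP flow described above, the following builds a form‑encoded POST to /v2/check and applies the first suggestion of each match to the original text. The host and port are assumptions about a local deployment. The sample response is hand‑written to mirror the general shape of the API's JSON (matches with offset, length, and replacements); it is not real server output, and the rule ID is a placeholder.

```python
import json
import urllib.parse
import urllib.request


def build_check_request(base_url, text, language):
    """Build (but do not send) a form-encoded POST to /v2/check."""
    body = urllib.parse.urlencode({"text": text, "language": language}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/v2/check",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def apply_first_suggestions(text, matches):
    """Apply the first replacement of each match, working right to left
    so that earlier offsets stay valid after each substitution."""
    for m in sorted(matches, key=lambda m: m["offset"], reverse=True):
        if m["replacements"]:
            start, end = m["offset"], m["offset"] + m["length"]
            text = text[:start] + m["replacements"][0]["value"] + text[end:]
    return text


# Hand-written sample shaped like a /v2/check response; not real server output.
SAMPLE_RESPONSE = json.loads("""
{"matches": [
  {"message": "Possible spelling mistake found.",
   "offset": 5, "length": 2,
   "replacements": [{"value": "is"}],
   "rule": {"id": "HYPOTHETICAL_RULE_ID"}}
]}
""")

# Assumed local server address; adjust to the actual deployment.
request = build_check_request("http://localhost:8080", "This si a test.", "en-US")
fixed = apply_first_suggestions("This si a test.", SAMPLE_RESPONSE["matches"])
print(request.full_url, fixed)
```

Against a running server, the request could be sent with urllib.request.urlopen(request) and the body decoded with json.load. Offsets are assumed here to line up with Python string indices, which holds for text without characters outside the Basic Multilingual Plane.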

Deployment & Infrastructure

  • Self‑Hosting: A single JAR can be deployed on any JVM‑capable host. The bundled install.sh script automates dependency installation (Java, optional database).
  • Containerization: Community Docker images exist, commonly built on a slim OpenJDK base image such as openjdk:17-jdk-slim. The exposed port (8080 in this setup) depends on the image and its configuration, and containers can be scaled horizontally behind a load balancer.
  • Scalability: Statelessness allows easy horizontal scaling; the rule cache can be shared via Redis if needed. For high‑throughput scenarios, a reverse proxy (NGINX) can handle TLS termination and request throttling.
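  • Resource Footprint: A single instance is reported to consume ~200 MB RAM and to run comfortably on a 2‑core CPU. Actual usage depends on how many languages are loaded and on the JVM heap settings, so measure before sizing edge deployments or lightweight CI runners.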
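When instances sit behind a load balancer or are started inside a CI job, it helps to wait until the server actually answers before sending checks. The sketch below polls GET /v2/languages, the API's language‑listing endpoint, as a readiness probe. Using that endpoint as a health check is a choice made for this example, and the retry counts and delays are arbitrary defaults.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=2.0):
    """Return the HTTP status for url, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return None


def wait_until_ready(base_url, attempts=10, delay=1.0,
                     fetch=fetch_status, sleep=time.sleep):
    """Poll GET /v2/languages until it answers 200 or attempts run out."""
    url = base_url.rstrip("/") + "/v2/languages"
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

The fetch and sleep parameters exist so the loop can be exercised without a network; in normal use, wait_until_ready("http://localhost:8080") is enough, with the address adjusted to the deployment.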

Integration & Extensibility

  • Plugins: The server can load additional rule JARs at runtime; community plugins add support for Microsoft Office, LibreOffice, or custom IDE integrations.
  • CI/CD & ChatOps: The check API is synchronous request/response, so pipelines and chat bots typically call /v2/check themselves and forward the returned matches rather than waiting for a server‑initiated callback.
  • Customization: User dictionaries can be injected via the API; custom styles (e.g., British vs. American English) are selectable by language variant codes (en-GB, en-US).
  • SDKs: While the primary SDK is Java, community bindings exist for Python (language_tool_python) and Node.js (languagetool-api), simplifying integration in polyglot stacks.
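Variant selection and per‑request rule toggles can be sketched as a small parameter builder. The parameter names used here (language, preferredVariants, enabledRules, disabledRules, enabledOnly) follow the public /v2/check API as far as I know and are worth verifying against the Swagger documentation of the deployed version. The rule IDs are placeholders, not real LanguageTool rules.

```python
import urllib.parse


def build_check_params(text, language="auto", preferred_variants=(),
                       enabled_rules=(), disabled_rules=(), enabled_only=False):
    """Return a form-encoded body for POST /v2/check with rule toggles."""
    if preferred_variants and language != "auto":
        # Variant preferences only matter when the language is auto-detected.
        raise ValueError("preferred_variants only makes sense with language='auto'")
    if enabled_only and not enabled_rules:
        raise ValueError("enabled_only needs at least one enabled rule")
    params = {"text": text, "language": language}
    if preferred_variants:
        params["preferredVariants"] = ",".join(preferred_variants)
    if enabled_rules:
        params["enabledRules"] = ",".join(enabled_rules)
    if disabled_rules:
        params["disabledRules"] = ",".join(disabled_rules)
    if enabled_only:
        params["enabledOnly"] = "true"
    return urllib.parse.urlencode(params)


# Explicit British English with two (placeholder) rules switched off.
british = build_check_params(
    "The colour is nice.", language="en-GB",
    disabled_rules=["PLACEHOLDER_RULE_A", "PLACEHOLDER_RULE_B"],
)
# Auto-detection that prefers British English and Austrian German variants.
detected = build_check_params(
    "The colour is nice.", preferred_variants=["en-GB", "de-AT"],
)
print(british)
print(detected)
```

Keeping the toggles in the request body means different callers can use different rule sets against the same server instance.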

Developer Experience

  • Documentation: The project hosts extensive docs—API reference, rule‑creation guides, and a Swagger UI. Code comments are concise yet informative.
  • Community: A dedicated forum and GitHub issue tracker provide prompt support; contributors can submit rule proposals via pull requests.
  • Licensing: LGPL 2.1 allows embedding in both open‑source and proprietary projects, provided modifications to the core are disclosed.
  • Testing: A robust suite of unit tests (JUnit) covers rule logic; CI pipelines run nightly builds and integration tests against the Docker image.

Use Cases

| Scenario | Why LanguageTool Fits |
|---|---|
| CI/Quality Gates | Run POST /v2/check on pull requests to enforce style guidelines before merge. |
| Enterprise Writing Assistants | Embed the Java API in internal document editors to provide real‑time grammar checks. |
| Multilingual Chatbots | Validate user input across languages before processing or storing. |
| Content Management Systems | Hook into CMS workflows to flag errors during content creation or review. |
| Educational Platforms | Offer students automated feedback on essays in multiple languages. |
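For the CI quality‑gate scenario, one possible shape is a step that turns the matches returned by /v2/check into compiler‑style messages and fails the job when there are too many. This is a sketch under assumptions: the match fields (offset, message, rule.id) mirror the API's JSON as described above, while the file path, the rule ID, and the issue budget are hypothetical.

```python
def summarize_matches(path, text, matches):
    """Turn matches into 'path:line:col: [RULE_ID] message' report lines."""
    reports = []
    for m in matches:
        before = text[:m["offset"]]
        line = before.count("\n") + 1
        # Column is 1-based, counted from the start of the current line.
        col = m["offset"] - (before.rfind("\n") + 1) + 1
        reports.append(f"{path}:{line}:{col}: [{m['rule']['id']}] {m['message']}")
    return reports


def gate(reports, max_issues=0):
    """Return a process exit code: 0 if within the issue budget, else 1."""
    return 0 if len(reports) <= max_issues else 1


# Hand-written example input; the match is not real server output.
text = "Fine line.\nThis si bad.\n"
matches = [{
    "offset": 16, "length": 2,
    "message": "Possible spelling mistake found.",
    "rule": {"id": "HYPOTHETICAL_RULE"},
}]
reports = summarize_matches("docs/readme.txt", text, matches)
exit_code = gate(reports, max_issues=0)
for r in reports:
    print(r)
```

In a pipeline, the script would end with sys.exit(exit_code) so that a non‑zero code blocks the merge.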

Advantages

  • Performance: Rules are loaded once and cached; reported latency is under 30 ms for typical sentences, though this varies with hardware, language, and the rules enabled.
  • Flexibility: Full control over rule sets; developers can add or remove rules without redeploying the entire server.
  • Open Source & License: LGPL allows commercial use without licensing fees, unlike proprietary grammar services.
  • Self‑Hosting: No reliance on external APIs; data never leaves the organization, satisfying strict privacy

Information

  • Category: other
  • License: LGPL-2.1
  • Stars: 13.7k

Technical Specs

  • Pricing: Open Source
  • Database: None
  • Docker: Community
  • Supported OS: Linux, Docker
  • Author: languagetool-org
  • Last Updated: 17 hours ago