Natural Language Processing as the automation driver in legal knowledge management

Sarah Maschek
Feb 10, 2020
3 min read

Updated: Dec 9, 2025

Taxy.io shows how NLP can reveal the logic behind German tax law: by mapping semantic networks and detecting meaningful links, such as the central role of § 370 AO, the AI automatically classifies documents, boosts research accuracy, and frees professionals from hours of manual sorting.

In this article:

How our AI deepens its understanding of German tax law
How machine learning and NLP enable intelligent text understanding
What a small selection of Natural Language Processing methods can look like
What semantic networks reveal about language and meaning
How texts are classified and what this reveals about their structure

How our AI deepens its understanding of German tax law

Knowledge management is an important topic for all tax consulting and auditing firms. It is estimated that tax departments spend millions of hours of work every year on research carried out in the traditional way. Tax professionals need to understand, process and memorize a wealth of constantly changing information. And even though it may not seem so when first looking at legal texts and administrative guidelines, this work is based on the analysis of natural language.

How machine learning and NLP enable intelligent text understanding

This is where Natural Language Processing (NLP) comes into play: an algorithm is trained to read and understand text. NLP is not a static set of methods but a collection of approaches that are constantly evolving. Alongside it, machine learning methods are used to continuously improve the accuracy of the search results.

And what does this actually mean?

What a small selection of Natural Language Processing methods can look like

“Artificial intelligence” and “machine learning” have become popular buzzwords that are often applied to procedures only loosely related to AI. In this article we therefore give a short, non-exhaustive overview of some of the NLP methods we use at Taxy.io.

What semantic networks reveal about language and meaning

Given the extensive primary and secondary literature on tax law, the first step is to examine which sections of the literature and of related fields are linked to each other, for example by cross-references. This approach combines rule-based procedures with machine learning, which is where artificial intelligence comes into play: the algorithm recognizes what is meant even when a reference contains a spelling mistake, for example.
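To make the idea of reference detection more concrete, here is a minimal sketch in Python. It is not Taxy.io's implementation: the rule-based part is a regular expression for citations such as "§ 370 Abs. 1 AO", and the tolerance for spelling mistakes is approximated with simple fuzzy string matching from the standard library's difflib instead of a trained model. The list KNOWN_LAWS, the function names and the cutoff value are assumptions made for illustration.

```python
import re
from difflib import get_close_matches

# Hypothetical, non-exhaustive list of statute abbreviations (assumption).
KNOWN_LAWS = ["AO", "UStG", "EStG", "KStG", "GewStG"]

# Rule-based part: matches citations such as "§ 370 AO" or "§370 Abs. 1 AO".
CITATION = re.compile(
    r"§\s*(?P<section>\d+[a-z]?)"           # section number, e.g. 370 or 15a
    r"(?:\s+(?:Abs|Nr|S)\.\s*\d+[a-z]?)*"   # optional subsection, number, sentence
    r"\s+(?P<law>[A-Za-zÄÖÜäöü]+)"          # statute abbreviation
)


def normalize_law(token, cutoff=0.7):
    """Map a possibly misspelled abbreviation to a known one, or return None."""
    if token in KNOWN_LAWS:
        return token
    lowered = {law.lower(): law for law in KNOWN_LAWS}
    # Fuzzy matching stands in for the learned component described in the text.
    matches = get_close_matches(token.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[matches[0]] if matches else None


def extract_references(text):
    """Return (section, law) pairs for every recognizable citation in the text."""
    references = []
    for match in CITATION.finditer(text):
        law = normalize_law(match.group("law"))
        if law is not None:  # abbreviations outside KNOWN_LAWS are skipped
            references.append((match.group("section"), law))
    return references


sample = "Siehe § 370 Abs. 1 AO, §15 USTG und § 4 Etsg; anders § 12 BGB."
print(extract_references(sample))
# [('370', 'AO'), ('15', 'UStG'), ('4', 'EStG')]
```

The misspelled "Etsg" is still resolved to EStG, while "BGB" is dropped because it is not in the illustrative list. Each extracted pair can then serve as an edge in a reference network.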

Network analysis on German tax laws and judgements with the highly interlinked tax evasion paragraph 370 AO in the centre.

Next, we can calculate the importance of references. In the literature on German tax law, the “Abgabenordnung” (AO; Fiscal Code) stands out as particularly important. Since the AO is the so-called basic tax law, this is hardly surprising. A closer look at the network around the “Abgabenordnung”, however, shows that § 370 AO in particular is referenced very often; this section of the code deals with tax evasion.
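The article does not state which importance measure is used, so the following is only a sketch of one common option: PageRank computed by power iteration over a small citation graph. The edges are invented for illustration (the judgments are placeholders, not real decisions), and the function name and parameters are assumptions.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Score nodes of a directed graph given as (citing, cited) pairs."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for citing, cited in edges:
        out_links[citing].append(cited)

    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing references is spread evenly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {node: base for node in nodes}
        for citing, targets in out_links.items():
            for cited in targets:
                new_rank[cited] += damping * rank[citing] / len(targets)
        rank = new_rank
    return rank


# Invented toy data: placeholder judgments citing sections of tax law.
toy_edges = [
    ("Judgment A", "§ 370 AO"),
    ("Judgment B", "§ 370 AO"),
    ("Judgment C", "§ 370 AO"),
    ("§ 371 AO", "§ 370 AO"),
    ("Judgment A", "§ 371 AO"),
    ("Judgment B", "§ 15 UStG"),
]

scores = pagerank(toy_edges)
for node, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{node}: {score:.3f}")
```

In this toy graph § 370 AO receives the highest score because most other nodes point to it. Unlike a plain count of incoming references, PageRank also gives more weight to references that come from nodes which are themselves frequently cited.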

If you would like to read more about the results of our network analysis, please refer to the article by our co-founders Daniel Kirch and Sven Weber.

How texts are classified and what this reveals about their structure

Moving from the network as a whole to individual texts, text classification plays an important role in Natural Language Processing. Texts are assigned to certain categories: emails can be classified as spam or not spam, for example, or customer ratings as positive or negative using sentiment analysis. Texts can also be assigned to specific subject areas. For tax law, this means that our algorithms, which have been refined with supervised learning techniques, can recognize which tax law topics a given text covers and automatically assign it to a subject area such as VAT or procedural law.
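As a rough illustration of supervised text classification, the sketch below trains a tiny multinomial Naive Bayes classifier with add-one smoothing, using only the Python standard library. This is a textbook technique chosen for brevity, not a description of the models used at Taxy.io. The four training sentences and the two labels are invented, and a realistic system would need far more labeled data.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-zäöüß]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        scores = {}
        for label, doc_count in self.label_counts.items():
            counts = self.word_counts[label]
            denominator = sum(counts.values()) + len(self.vocab)
            score = math.log(doc_count / total_docs)  # log prior
            for word in tokenize(text):
                if word in self.vocab:  # words never seen in training are ignored
                    score += math.log((counts[word] + 1) / denominator)
            scores[label] = score
        return max(scores, key=scores.get)


# Invented training examples (assumption, for illustration only).
train_texts = [
    "input tax deduction on the invoice for the supply of goods",
    "reverse charge applies to the taxable supply and the invoice must show VAT",
    "the objection against the tax assessment was filed after the deadline",
    "the tax office may amend the assessment notice within the limitation period",
]
train_labels = ["VAT", "VAT", "procedural law", "procedural law"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("Is input tax deductible if the invoice is incomplete?"))
# VAT
print(model.predict("The objection was filed before the deadline"))
# procedural law
```

The classifier learns which words are typical for each subject area from the labeled examples and assigns a new text to the area whose vocabulary it matches best.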

Text and topic identification: here, texts have been recognized as judgments and decisions, topics classified, and legislative bodies identified
