(This article is a rewritten and translated version of a research report by the FrOG team, funded by Wikimedia Indonesia's Wikidata Research Fund 2024. The report was presented on behalf of the team by Fariz Darari (University of Indonesia) and Fajar J. Ekaputra (WU Vienna). It has been rewritten by Hisyam on behalf of Wikimedia Indonesia to make the content accessible to a general audience.)
In an effort to support innovative uses of open knowledge, Wikimedia Indonesia launched the Wikidata Research Fund 2024. This program provides funding and institutional support for research that explores the use of Wikidata—a free, collaborative knowledge base—as both a data source and a technological platform. By funding interdisciplinary projects, the initiative aims to foster new ideas that make structured data more usable, discoverable, and impactful—especially within the growing fields of artificial intelligence (AI), education, and data science.
One of the funded research teams came from Universitas Indonesia, working in collaboration with WU Vienna. Their project focused on solving a common problem in today’s AI systems: when language models give confident but incorrect answers—a phenomenon known as hallucination. To tackle this, the team proposed a solution that combines Large Language Models (LLMs) with Knowledge Graphs (KGs) like Wikidata and DBpedia. The outcome was the development of an open-source framework called FrOG—short for Framework of Open GraphRAG. This framework is designed to help machines retrieve better facts and generate more trustworthy answers by grounding them in structured, verifiable data such as the Wikidata knowledge graph.
Bridging Knowledge and Language
At the heart of this research lies a growing challenge in modern artificial intelligence: how to make large language models (LLMs) such as ChatGPT or Gemini produce factually accurate and explainable answers. While these models are excellent at generating fluent responses, they often fabricate facts—a problem known as hallucination. This becomes especially problematic in domains where factual grounding is crucial, such as education, law, or scientific research.
To counteract hallucination, researchers have increasingly turned to Knowledge Graphs (KGs)—structured datasets that store information in the form of entities and their relationships. Unlike text-based documents, knowledge graphs like Wikidata and DBpedia allow for precise, machine-readable representations of facts. This makes them suitable for Retrieval-Augmented Generation (RAG), a technique that allows language models to retrieve relevant facts from an external source before generating responses. However, connecting LLMs and KGs requires complex pipelines involving ontology alignment, query generation, and contextual understanding of user intent.
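To make the RAG idea concrete, here is a minimal sketch of retrieval-augmented prompting over a knowledge graph: facts (triples) relevant to the question are retrieved first and prepended to the prompt that would be sent to an LLM. The tiny in-memory triple list and keyword matcher below are placeholders for a real KG endpoint and retriever, not FrOG's actual implementation.

```python
# Minimal RAG-over-KG sketch: retrieve relevant triples, then build an
# augmented prompt that grounds the LLM's answer in those facts.
triples = [
    ("Jakarta", "capital of", "Indonesia"),
    ("Indonesia", "official language", "Indonesian"),
    ("Vienna", "capital of", "Austria"),
]

def retrieve(question: str, limit: int = 2):
    """Rank triples by how many question words they mention (toy retriever)."""
    words = set(question.lower().replace("?", "").split())
    scored = sorted(
        triples,
        key=lambda t: sum(w.lower() in words for part in t for w in part.split()),
        reverse=True,
    )
    return scored[:limit]

def build_prompt(question: str) -> str:
    """Prepend retrieved facts to the question before generation."""
    facts = "\n".join(f"- {s} {p} {o}" for s, p, o in retrieve(question))
    return f"Facts:\n{facts}\n\nQuestion: {question}\nAnswer using only the facts above."

prompt = build_prompt("What is the capital of Indonesia?")
print(prompt)
```

Because the model is asked to answer "using only the facts above", its output can be checked against the retrieved triples, which is what makes RAG answers more verifiable than free-form generation.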
Recent studies have begun to explore this integration. For example, some researchers have introduced approaches such as GraphRAG and KGT5, which attempt to link neural generation models with graph-structured knowledge. However, many of these systems are either monolingual, domain-specific, or not openly accessible. The FrOG project builds on these findings but introduces a system that is open-source, multilingual, and domain-agnostic, aiming to make knowledge-grounded AI more inclusive and reproducible—especially in underrepresented languages like Bahasa Indonesia.
Designing FrOG
To turn their idea into a working framework, the research team designed FrOG—Framework of Open GraphRAG. The framework’s goal is to connect natural-language questions with accurate data from structured knowledge graphs such as Wikidata, DBpedia, or other custom KGs. Instead of relying solely on pre-trained language model knowledge, FrOG uses a retrieval-augmented pipeline that actively pulls in relevant facts from knowledge graphs before forming an answer. This approach makes the generated responses more grounded, transparent, and adaptable across languages and domains.

The development of FrOG followed two main iterations. The first version, Pipeline v1, tested the basic idea of translating a user’s natural language question into a structured SPARQL query—the standard query language for knowledge graphs. Using a combination of prompt engineering and a base LLM, this pipeline was able to generate queries that retrieved relevant answers from Wikidata. However, it had limitations in understanding more complex sentence structures, handling multilingual input, and adapting across datasets.
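To illustrate what the natural-language-to-SPARQL step produces, a question such as "What is the capital of Indonesia?" might map to a query like the one below. The template and helper function are hypothetical (Pipeline v1 generates queries via prompt engineering rather than fixed templates), but the Wikidata identifiers are real: `Q252` is Indonesia and `P36` is the "capital" property.

```python
# Illustrative NL-to-SPARQL output for a simple factual question.
# The build_capital_query helper is hypothetical; FrOG's prompts and
# query shapes may differ.
def build_capital_query(entity_qid: str) -> str:
    """Return a Wikidata SPARQL query asking for the capital of an entity."""
    return (
        "SELECT ?capital ?capitalLabel WHERE {\n"
        f"  wd:{entity_qid} wdt:P36 ?capital .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}"
    )

query = build_capital_query("Q252")  # Q252 = Indonesia
print(query)
```

Running such a query against the Wikidata endpoint returns Jakarta, and because the query is explicit and inspectable, the answer can be traced back to the exact facts it came from.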

To improve performance, the team developed Pipeline v2, a more advanced version that introduced several enhancements. This version incorporated multilingual LLMs (such as Qwen2.5 7B), improved prompt chaining strategies, and most notably, a vector-based ontology retrieval system. The ontology retriever helps the model better understand the structure and terminology of each knowledge graph, boosting accuracy even with domain-specific data. The pipeline also supported multiple datasets—including a manually constructed Curriculum Knowledge Graph based on academic data from Universitas Indonesia—which made the system more versatile in real-world use cases.
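The ontology retrieval step can be sketched as follows: descriptions of ontology terms and the user's question are embedded as vectors, and terms are ranked by cosine similarity to the question. The toy bag-of-words "embedding" and the curriculum property names below are illustrative stand-ins; Pipeline v2 uses learned vector embeddings and the actual Curriculum KG ontology.

```python
# Sketch of vector-based ontology retrieval: rank ontology terms by
# cosine similarity between the question vector and each term's
# description vector. Bag-of-words vectors stand in for real embeddings.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical ontology term descriptions from a curriculum KG
ontology = {
    "hasPrerequisite": "course that must be taken before another course",
    "taughtBy": "lecturer who teaches a course",
    "creditValue": "number of credits awarded for a course",
}

question = "Which course must be taken before Algorithms?"
q_vec = embed(question)
ranked = sorted(ontology, key=lambda t: cosine(q_vec, embed(ontology[t])), reverse=True)
print(ranked[0])  # best-matching ontology property
```

Feeding the top-ranked terms into the query-generation prompt is what lets the model use the right vocabulary for each knowledge graph, even one it has never seen before.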
Evaluating Accuracy and Multilingual Capabilities
To assess the effectiveness of FrOG, the researchers conducted a series of evaluations across three knowledge graphs: Wikidata, DBpedia, and a custom-built Curriculum Knowledge Graph derived from academic data at Universitas Indonesia. These tests aimed to evaluate how well the system could convert natural-language questions into accurate queries—queries that would return correct and complete answers from the knowledge graph. Accuracy was measured using Jaccard Similarity, a metric that compares the overlap between machine-generated queries and those written by human experts.
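Jaccard Similarity is the size of the intersection of two sets divided by the size of their union. One way to instantiate it for this evaluation (an illustrative reading, since the report's exact comparison procedure is not detailed here) is to compare the answer sets returned by the generated query and the reference query:

```python
# Jaccard Similarity between two result sets: |A ∩ B| / |A ∪ B|.
def jaccard(generated: set, reference: set) -> float:
    if not generated and not reference:
        return 1.0  # two empty result sets agree perfectly
    return len(generated & reference) / len(generated | reference)

# Hypothetical query results for illustration
gen = {"Algorithms", "Data Structures", "Calculus"}
ref = {"Algorithms", "Data Structures", "Linear Algebra"}
print(jaccard(gen, ref))  # 2 shared / 4 total = 0.5
```

A score of 1.0 means the generated query returns exactly the same answers as the expert-written one, so the reported 0.805 indicates a large but not perfect overlap.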
The results showed a significant improvement between Pipeline v1 and Pipeline v2. In particular, Pipeline v2, when powered by the Qwen2.5 7B model, achieved the highest performance. On the Curriculum KG dataset, it reached a Jaccard Similarity score of 0.805, indicating that the queries it generated were highly similar to the reference queries handcrafted by the researchers. This level of precision suggests that the system not only understood the structure of the questions but also navigated the ontology of the graph effectively.
Moreover, the multilingual capabilities of FrOG proved to be a key strength. Unlike most existing systems, which are often limited to English, FrOG was able to interpret and process questions in Bahasa Indonesia and German, alongside English. This opens up possibilities for localized applications of AI in underrepresented languages—an important goal in the context of Wikimedia’s global mission. The consistent performance across different languages and domains confirms FrOG’s potential as a general-purpose, multilingual knowledge access tool.
Looking Ahead
The FrOG project demonstrates the growing potential of combining large language models with structured knowledge graphs to create more reliable, multilingual, and transparent AI systems. By developing an open-source framework that connects natural language input with factual data from KG sources like Wikidata and DBpedia, the research team from Universitas Indonesia and WU Vienna has laid important groundwork for future applications in education, public information systems, and low-resource language support. Their contributions show how open data and open technology can work hand-in-hand to make artificial intelligence more explainable and inclusive.
This research has resulted in two scientific publications. The first, titled “Towards an Open NLI LLM-based System for KGs: A Case Study of Wikidata”, was published in December 2024 in the Proceedings of the International Seminar on Research of Information Technology and Intelligent Systems (ISRITI). This early-stage study introduced the foundational ideas that would later evolve into FrOG. The second paper, “FrOG: Framework of Open GraphRAG”, presents the full design and evaluation of the FrOG system. It was accepted for publication and presented at the LLM-TEXT2KG 2025 Workshop at the European Semantic Web Conference (ESWC). This paper outlines the complete system architecture, multilingual experiments, and its application to both public and domain-specific knowledge graphs. The paper is currently in the publication stage; however, a preprint version is available on the research team’s report page on Meta. In the report, the team also provided a link to the repository for the FrOG source code.
Looking ahead, the team plans to further develop FrOG by enhancing support for Bahasa Indonesia, refining how answers are phrased (verbalization), and adding persistent data storage to enable long-term use. These efforts aim not only to improve system performance, but also to encourage the growth of localized, explainable AI powered by open knowledge. The research team also plans to expand the framework for their next project, MEGA-FrOG.