
Recent research

"LLMs Know More, Hallucinate Less" with Wikidata

A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.

"Fine-tuned LLMs Know More, Hallucinate Less with Few-Shot Sequence-to-Sequence Semantic Parsing over Wikidata"

Overview of how the authors' "WikiSP" semantic parser is used to answer a user's question:
"An entity linker is used to link entities in the user query to their unique ID in Wikidata; e.g. “A Bronx Tale” is linked to entity ID “Q1130705”. The query and entity linker outputs are fed to the WikiSP semantic parser to produce a modified version of SPARQL, where property IDs (e.g. “P915”) are replaced by their unique string identifiers (e.g. “filming_location”). If applying the [SPARQL] query to Wikidata fails to return a result, we default to [OpenAI's large language model] GPT-3, labeling the result as a GPT-3 guess. Returned answers are presented in the context of the query, so the user can tell if the answer is acceptable; if not, we also show the guess from GPT-3. Here WikiSP mistakenly uses “filming_location” instead of “narrative_location”; the user detects the mistake, thumbs down the answer, and the GPT-3 answer is provided."

This paper (by five graduate students at Stanford University's computer science department and Monica S. Lam as last author) sets out to show that

To do this, the paper "presents WikiSP, a few-shot sequence-to-sequence semantic parser for Wikidata that translates a user query, along with results from an entity linker, directly into SPARQL queries [to retrieve information from Wikidata]." WikiSP was obtained by fine-tuning one of Facebook/Meta's LLaMA 1 large language models.

For example, the user question "What year did giants win the world series?" is supposed to be converted into the query

  SELECT DISTINCT ?x WHERE {?y wdt:sports_season_of_league_or_competition wd:Q265538; wdt:winner wd:Q308966; wdt:point_in_time ?x. }

The paper uses a modified SPARQL syntax that replaces numerical property IDs (here, P3450) with their English-language label (here, "sports season of league or competition"). The authors motivate this choice by observing that "While zero-shot LLMs [e.g. ChatGPT] can generate SPARQL queries for the easiest and most common questions, they do not know all the PIDs and QIDs [property and item IDs in Wikidata], and nor is it possible to include them in a prompt."
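Turning such a label-based query back into executable SPARQL is essentially a string substitution over a label-to-PID table. As a rough illustration (this is not the authors' released code, and the property mapping below is an assumed, hard-coded subset), the following Python sketch resolves the labels and runs the example against the public Wikidata query endpoint:

  import re
  import requests

  # Illustrative label-to-PID table; a full system would derive this
  # from Wikidata's property labels rather than hard-coding it.
  PROPERTY_IDS = {
      "sports_season_of_league_or_competition": "P3450",
      "winner": "P1346",
      "point_in_time": "P585",
  }

  def to_executable_sparql(query: str) -> str:
      """Replace wdt:<label> tokens with numeric PIDs (e.g. wdt:P3450)."""
      return re.sub(r"wdt:([A-Za-z_]+)",
                    lambda m: "wdt:" + PROPERTY_IDS[m.group(1)], query)

  query = ("SELECT DISTINCT ?x WHERE { "
           "?y wdt:sports_season_of_league_or_competition wd:Q265538; "
           "wdt:winner wd:Q308966; wdt:point_in_time ?x. }")

  response = requests.get(
      "https://query.wikidata.org/sparql",
      params={"query": to_executable_sparql(query), "format": "json"},
      headers={"User-Agent": "wikisp-example/0.1"},  # Wikimedia asks clients to send a UA
  )
  for row in response.json()["results"]["bindings"]:
      print(row["x"]["value"])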

To evaluate the performance of "WikiSP", and as a second contribution of the paper, the authors present

Using this new benchmark, "Our experimental results demonstrate the effectiveness of [WikiSP], establishing a strong baseline of 76% and 65% answer accuracy in the dev and test sets of WikiWebQuestions, respectively." However, the paper's "Limitations" section hints that despite the impressive "12 billion facts" factoid that the paper opens with, Wikidata's coverage may be too limited to answer most user questions in a satisfying manner:

To address this weakness, the authors combine this Wikidata-based setup with a standard LLM that provides the answer if the Wikidata query fails to return a result. They state that

Data and evaluation code from the paper have been released in a GitHub repo, where the authors state that "We are now working on releasing fine-tuned models."
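In outline, the resulting hybrid behaves as a simple fallback chain: answer from Wikidata when the generated query returns results, otherwise surface an explicitly labeled LLM guess. A minimal Python sketch of that control flow, using hypothetical stand-ins (run_sparql and gpt3_guess below are not APIs from the paper):

  import requests

  def run_sparql(query: str) -> list[str]:
      """Run an (executable) SPARQL query against Wikidata; [] if no match."""
      r = requests.get("https://query.wikidata.org/sparql",
                       params={"query": query, "format": "json"},
                       headers={"User-Agent": "wikisp-example/0.1"})
      rows = r.json()["results"]["bindings"]
      return [value["value"] for row in rows for value in row.values()]

  def gpt3_guess(question: str) -> str:
      """Hypothetical stand-in for the paper's GPT-3 fallback."""
      return f"[unverified LLM guess for: {question}]"

  def answer(question: str, executable_sparql: str) -> tuple[str, bool]:
      """Return (answer, verified): verified answers are grounded in
      Wikidata; anything else is shown to the user as a labeled guess."""
      results = run_sparql(executable_sparql)
      if results:
          return results[0], True
      return gpt3_guess(question), False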

The paper's endeavour bears some similarity to a paper authored by a different team of Stanford graduate students with Professor Lam, which sought to use Wikipedia (rather than Wikidata) to reduce LLM hallucinations; see the review in our July issue: "Wikipedia-based LLM chatbot 'outperforms all baselines' regarding factual accuracy".

Briefly


Other recent publications

Other recent publications that could not be covered in time for this issue include the items listed below. Contributions, whether reviewing or summarizing newly published research, are always welcome.

"Using Large Language Models for Knowledge Engineering (LLMKE): A Case Study on Wikidata"

From the abstract:

"Large language models learn to organize concepts in ways that are strikingly similar to how concepts are organized in [Wikidata]"

From the abstract:

"Enhancing Multilingual Language Model with Massive Multilingual Knowledge Triples" from Wikidata

From the abstract:

"KGConv, a Conversational Corpus grounded in Wikidata"

From the abstract:

"WikiDialog" dataset: "Dialog inpainting" using Wikipedia

From the abstract of a paper presented by a team of Google researchers at last year's ICML conference:

As "a real example of a dialog inferred from a Wikipedia passage using dialog inpainting" the paper presents the following (abridged) exchange between an "imagined reader" of the Freshman 15 article and a Wikipedia "Writer" who (after the initial greeting) always answers with excerpts from the article, with all other sentences filled in by the inpainter:

  • Writer: "I'm an automated assistant. I can tell you about Freshman 15."
  • Imagined Reader: "How does the freshman 15 relate to eating habits?"
  • Writer: "In parental-supervised diets, students also usually ingest the proper proportion [...]"
  • Imagined Reader: "What is the cause of this?"
  • Writer: "This is because when students go off to college, they face an independence [...]"
  • Imagined Reader: "Do people tend to eat healthier or less healthy when they are away from home?"
  • Writer: "Research has shown that over 60 percent of college students commonly ingest [...]"

Wikipedia-based "Retrieval Augmentation Reduces Hallucination in Conversation" with large language models

From the abstract of a 2021 paper by a team from Facebook AI Research:


Large language models as an alternative to Wikidata?

From the abstract:

The authors acknowledge that "Starting from [a 2019 paper], many works have explored whether this LM-as-KB paradigm [i.e. the ability of LLMs to answer factual questions, by now familiar to users of ChatGPT] could provide an alternative to structured knowledge bases such as Wikidata." However, the paper concludes, as of 2021,
