arXiv:2509.01110

NoLBERT: A No Lookahead(back) Foundational Language Model for Empirical Research

Published on Sep 1, 2025

Abstract

We present NoLBERT, a lightweight, timestamped foundational language model for empirical research in social sciences, particularly in economics and finance. By pre-training exclusively on 1976-1995 text, NoLBERT avoids both lookback and lookahead biases that can undermine econometric inference. It exceeds domain-specific baselines on NLP benchmarks while maintaining temporal consistency. Applied to patent texts, NoLBERT enables the construction of firm-level innovation networks and shows that gains in innovation centrality predict higher long-run profit growth.

AI-generated summary

NoLBERT, a timestamped language model pre-trained exclusively on 1976-1995 text, avoids lookahead and lookback biases, performs well on NLP benchmarks, and supports the construction of firm-level innovation networks whose centrality gains predict long-run profit growth.
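The downstream application described in the abstract (embedding patent texts, linking firms into an innovation network, and measuring each firm's centrality) can be sketched in a few lines of Python. The snippet below is a minimal illustration, not the authors' pipeline: the checkpoint name "nolbert-base", the cosine-similarity threshold, and the choice of eigenvector centrality as the innovation-centrality measure are all assumptions made for this example.

```python
# Minimal sketch: embed patent texts with a NoLBERT-style checkpoint,
# build a firm-level similarity network, and compute a centrality score.
# The checkpoint name, threshold, and centrality measure are assumptions,
# not taken from the paper.

from itertools import combinations

import torch
import networkx as nx
from transformers import AutoTokenizer, AutoModel

MODEL_ID = "nolbert-base"  # hypothetical checkpoint name

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)
model.eval()


def embed(text: str) -> torch.Tensor:
    """Mean-pool the last hidden states into a single patent embedding."""
    inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state      # (1, seq_len, dim)
    mask = inputs["attention_mask"].unsqueeze(-1)       # (1, seq_len, 1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)  # (1, dim)


# Toy firm -> patent-text mapping, purely illustrative.
firm_patents = {
    "FirmA": "A method for semiconductor lithography ...",
    "FirmB": "An apparatus for optical data storage ...",
    "FirmC": "A process for polymer synthesis ...",
}

firm_vecs = {firm: embed(text) for firm, text in firm_patents.items()}

# Connect firms whose patent embeddings are sufficiently similar.
G = nx.Graph()
G.add_nodes_from(firm_vecs)
for (f1, v1), (f2, v2) in combinations(firm_vecs.items(), 2):
    sim = torch.nn.functional.cosine_similarity(v1, v2).item()
    if sim > 0.5:  # arbitrary threshold for the sketch
        G.add_edge(f1, f2, weight=sim)

# One simple proxy for "innovation centrality": eigenvector centrality.
centrality = nx.eigenvector_centrality(G, weight="weight")
print(centrality)
```

In the paper's setting, changes in such centrality scores over time are the quantity related to long-run profit growth; the network construction and centrality definition above are only one plausible instantiation.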
