Scores on benchmarks

Ranks shown below are relative to all public models.

Score  Benchmark                                       Rank  Benchmarks
.433   average_language                                10    3
.580   neural_language                                 10    2
.580   Pereira2018-linear                              10    2
.568   Pereira2018.384sentences-linear v1              10
.593   Pereira2018.243sentences-linear v1              10
.286   behavior_language                               10    1
.286   Futrell2018-pearsonr v1 [reference]             10
.408   engineering_language                            10    30
.408   SyntaxGym [reference]                           10    30
.500   syntaxgym-npi_src_ever v1 [reference]           8
.368   syntaxgym-reflexive_orc_fem v1 [reference]      4
.737   syntaxgym-number_prep v1 [reference]            7
.632   syntaxgym-reflexive_orc_masc v1 [reference]     7
.526   syntaxgym-number_orc v1 [reference]             7
.711   syntaxgym-npi_orc_any v1 [reference]            8
.158   syntaxgym-npi_orc_ever v1 [reference]           9
.211   syntaxgym-reflexive_src_fem v1 [reference]      7
.632   syntaxgym-reflexive_prep_masc v1 [reference]    7
.607   syntaxgym-center_embed_mod v1 [reference]       10
.632   syntaxgym-reflexive_prep_fem v1 [reference]     1
.786   syntaxgym-center_embed v1 [reference]           10
.957   syntaxgym-subordination_pp-pp v1 [reference]    6
.870   syntaxgym-subordination_orc-orc v1 [reference]  8
.913   syntaxgym-subordination v1 [reference]          8
.474   syntaxgym-reflexive_src_masc v1 [reference]     8
.913   syntaxgym-subordination_src-src v1 [reference]  7
.921   syntaxgym-npi_src_any v1 [reference]            6
.684   syntaxgym-number_src v1 [reference]             9
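
The aggregate rows above appear to follow from simple averaging of their child scores. The sketch below checks that reading against the numbers on this page. It is an inference from the displayed scores, not a documented Brain-Score rule: it assumes that average_language averages only the neural and behavioral scores, and that the 11 SyntaxGym benchmarks that are counted (30) but not listed here (19) contribute 0. All variable names are illustrative.

```python
from statistics import mean

# Leaf scores copied from the listing above.
pereira = [0.568, 0.593]   # Pereira2018 384- and 243-sentence sets
futrell = 0.286            # Futrell2018-pearsonr
syntaxgym_listed = {
    "npi_src_ever": 0.500, "reflexive_orc_fem": 0.368, "number_prep": 0.737,
    "reflexive_orc_masc": 0.632, "number_orc": 0.526, "npi_orc_any": 0.711,
    "npi_orc_ever": 0.158, "reflexive_src_fem": 0.211,
    "reflexive_prep_masc": 0.632, "center_embed_mod": 0.607,
    "reflexive_prep_fem": 0.632, "center_embed": 0.786,
    "subordination_pp-pp": 0.957, "subordination_orc-orc": 0.870,
    "subordination": 0.913, "reflexive_src_masc": 0.474,
    "subordination_src-src": 0.913, "npi_src_any": 0.921, "number_src": 0.684,
}
SYNTAXGYM_TOTAL = 30  # benchmark count shown on the page; only 19 are listed

# Assumption: each aggregate is the mean of its children.
neural = mean(pereira)
behavior = futrell
# Assumption: unlisted SyntaxGym benchmarks count as 0 in the mean.
engineering = sum(syntaxgym_listed.values()) / SYNTAXGYM_TOTAL
# Assumption: the overall average excludes the engineering score.
average = mean([neural, behavior])


def matches(computed: float, reported: float, tol: float = 1e-3) -> bool:
    """True if a computed aggregate agrees with the displayed 3-digit score."""
    return abs(computed - reported) < tol


for name, computed, reported in [
    ("neural_language", neural, 0.580),
    ("behavior_language", behavior, 0.286),
    ("engineering_language", engineering, 0.408),
    ("average_language", average, 0.433),
]:
    print(f"{name}: computed {computed:.4f}, reported {reported:.3f}, "
          f"match={matches(computed, reported)}")
```

Under these assumptions all four aggregates agree with the displayed values to within rounding. Averaging all three categories instead would give roughly .425 rather than the reported .433.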

How to use

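from brainscore_language import load_model

# Load the model by its Brain-Score identifier.
model = load_model("lm1b")
# Configure a behavioral task and/or a neural recording, then pass in text.
model.start_behavioral_task(...)
model.start_neural_recording(...)
model.digest_text(...)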

Benchmarks bibtex

@proceedings{futrell2018natural,
  title={The Natural Stories Corpus},
  author={Futrell, Richard and Gibson, Edward and Tily, Harry J. and Blank, Idan and Vishnevetsky, Anastasia and
          Piantadosi, Steven T. and Fedorenko, Evelina},
  conference={International Conference on Language Resources and Evaluation (LREC)},
  url={http://www.lrec-conf.org/proceedings/lrec2018/pdf/337.pdf},
  year={2018}
}

@inproceedings{gauthier-etal-2020-syntaxgym,
    title = "{S}yntax{G}ym: An Online Platform for Targeted Evaluation of Language Models",
    author = "Gauthier, Jon and Hu, Jennifer and Wilcox, Ethan and Qian, Peng and Levy, Roger",
    booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations",
    month = jul,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.acl-demos.10",
    pages = "70--76",
    abstract = "Targeted syntactic evaluations have yielded insights into the generalizations learned by neural network language models. However, this line of research requires an uncommon confluence of skills: both the theoretical knowledge needed to design controlled psycholinguistic experiments, and the technical proficiency needed to train and deploy large-scale language models. We present SyntaxGym, an online platform designed to make targeted evaluations accessible to both experts in NLP and linguistics, reproducible across computing environments, and standardized following the norms of psycholinguistic experimental design. This paper releases two tools of independent value for the computational linguistics community: 1. A website, syntaxgym.org, which centralizes the process of targeted syntactic evaluation and provides easy tools for analysis and visualization; 2. Two command-line tools, {`}syntaxgym{`} and {`}lm-zoo{`}, which allow any user to reproduce targeted syntactic evaluations and general language model inference on their own machine.",
}

Layer Commitment

No layer commitments found for this model. Older submissions might not have stored this information but will be updated when evaluated on new benchmarks.
