diff --git a/.gitattributes b/.gitattributes new file mode 100644 index 0000000000000000000000000000000000000000..f46f00d6ee530dac728923ce79d3f542330db189 --- /dev/null +++ b/.gitattributes @@ -0,0 +1,18 @@ +*.png filter=lfs diff=lfs merge=lfs -text +*.jpg filter=lfs diff=lfs merge=lfs -text +*.jpeg filter=lfs diff=lfs merge=lfs -text +*.gif filter=lfs diff=lfs merge=lfs -text +*.mp4 filter=lfs diff=lfs merge=lfs -text +*.mov filter=lfs diff=lfs merge=lfs -text +*.avi filter=lfs diff=lfs merge=lfs -text +*.csv filter=lfs diff=lfs merge=lfs -text +*.json filter=lfs diff=lfs merge=lfs -text +*.pdf filter=lfs diff=lfs merge=lfs -text +*.wav filter=lfs diff=lfs merge=lfs -text +*.mp3 filter=lfs diff=lfs merge=lfs -text +# the package and package lock should not be tracked +package.json -filter -diff -merge text +package-lock.json -filter -diff -merge text +# Notion imported images should NOT be in LFS (needed for Docker build) +app/src/content/assets/image/image_27877f1c*.png -filter -diff -merge text +app/scripts/notion-importer/output/** -filter -diff -merge text diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000000000000000000000000000000000000..3344bbd884b24bdc19198c0b2725a89b7593f84e --- /dev/null +++ b/.gitignore @@ -0,0 +1,41 @@ +# Python +__pycache__ +*.py[cod] +*.so +.Python +env/ +venv/ +*.egg-info/ +dist/ +build/ +*.egg +.idea/ +.vscode/ +.astro/ +.claude/ +*.swp +.DS_Store +# Node +node_modules/ +*.log +*.env +*.cache +.notion-to-md + +app/scripts/latex-to-mdx/output/ +app/scripts/notion-importer/output/**/* +app/src/content/embeds/typography/generated + +# PDF export +app/public/*.pdf +app/public/*.png +app/public/*.jpg +app/public/data/**/* + +.astro/ + +# Template sync temporary directories +.template-sync/ +.temp-*/ +.backup-*/ + diff --git a/CHANGELOG.md b/CHANGELOG.md new file mode 100644 index 0000000000000000000000000000000000000000..5837b2b57b8d319f7a12c1b0ff413044b7792f33 --- /dev/null +++ b/CHANGELOG.md @@ -0,0 +1,118 @@ +# 
Changelog + +All notable changes to the Research Article Template will be documented in this file. + +The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), +and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). + +## [Unreleased] + +### Added +- Initial open source release +- Comprehensive documentation +- Contributing guidelines +- License file + +## [1.0.0] - 2024-12-19 + +### Added +- **Core Features**: + - Markdown/MDX-based writing system + - KaTeX mathematical notation support + - Syntax highlighting for code blocks + - Academic citations with BibTeX integration + - Footnotes and sidenotes system + - Auto-generated table of contents + - Interactive Mermaid diagrams + - Plotly.js and D3.js integration + - HTML embed support + - Gradio app embedding + - Dataviz color palettes + - Image optimization + - SEO-friendly structure + - Automatic PDF export + - Dark/light theme toggle + - Mobile-responsive design + - LaTeX import functionality + - Template synchronization system + +- **Components**: + - Figure component with captions + - MultiFigure for image galleries + - Note component with variants + - Quote component + - Accordion for collapsible content + - Sidenote component + - Table of Contents + - Theme Toggle + - HTML Embed + - Raw HTML support + - SEO component + - Hero section + - Footer + - Full-width and wide layouts + +- **Build System**: + - Astro 4.10.0 integration + - PostCSS with custom media queries + - Automatic compression + - Docker support + - Nginx configuration + - Git LFS support + +- **Scripts**: + - PDF export functionality + - LaTeX to MDX conversion + - Template synchronization + - Font SVG generation + - TrackIO data generation + +- **Documentation**: + - Getting started guide + - Writing best practices + - Component reference + - LaTeX conversion guide + - Interactive examples + +### Technical Details +- **Framework**: Astro 4.10.0 +- **Styling**: PostCSS with custom 
properties +- **Math**: KaTeX 0.16.22 +- **Charts**: Plotly.js 3.1.0, D3.js 7.9.0 +- **Diagrams**: Mermaid 11.10.1 +- **Node.js**: >=20.0.0 +- **License**: CC-BY-4.0 + +### Browser Support +- Chrome (latest) +- Firefox (latest) +- Safari (latest) +- Edge (latest) + +--- + +## Version History + +- **1.0.0**: Initial stable release with full feature set +- **0.0.1**: Development version (pre-release) + +## Migration Guide + +### From 0.0.1 to 1.0.0 + +This is the first stable release. No breaking changes from the development version. + +### Updating Your Project + +Use the template synchronization system to update: + +```bash +npm run sync:template -- --dry-run # Preview changes +npm run sync:template # Apply updates +``` + +## Support + +- **Documentation**: [Hugging Face Space](https://huggingface.co/spaces/tfrere/research-article-template) +- **Issues**: [Community Discussions](https://huggingface.co/spaces/tfrere/research-article-template/discussions) +- **Contact**: [@tfrere](https://huggingface.co/tfrere) diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md new file mode 100644 index 0000000000000000000000000000000000000000..a4573b5d9abcd9e9ba35095677d0443b157298ec --- /dev/null +++ b/CONTRIBUTING.md @@ -0,0 +1,196 @@ +# Contributing to Research Article Template + +Thank you for your interest in contributing to the Research Article Template! This document provides guidelines and information for contributors. + +## 🤝 How to Contribute + +### Reporting Issues + +Before creating an issue, please: +1. **Search existing issues** to avoid duplicates +2. **Use the issue template** when available +3. **Provide detailed information**: + - Clear description of the problem + - Steps to reproduce + - Expected vs actual behavior + - Environment details (OS, Node.js version, browser) + - Screenshots if applicable + +### Suggesting Features + +We welcome feature suggestions! Please: +1. **Check existing discussions** first +2. **Describe the use case** clearly +3. 
**Explain the benefits** for the community +4. **Consider implementation complexity** + +### Code Contributions + +#### Getting Started + +1. **Fork the repository** on Hugging Face +2. **Clone your fork**: + ```bash + git clone git@hf.co:spaces//research-article-template + cd research-article-template + ``` +3. **Install dependencies**: + ```bash + cd app + npm install + ``` +4. **Create a feature branch**: + ```bash + git checkout -b feature/your-feature-name + ``` + +#### Development Workflow + +1. **Make your changes** following our coding standards +2. **Test thoroughly**: + ```bash + npm run dev # Test locally + npm run build # Ensure build works + ``` +3. **Update documentation** if needed +4. **Commit with clear messages**: + ```bash + git commit -m "feat: add new component for interactive charts" + ``` + +#### Pull Request Process + +1. **Push your branch**: + ```bash + git push origin feature/your-feature-name + ``` +2. **Create a Pull Request** with: + - Clear title and description + - Reference related issues + - Screenshots for UI changes + - Testing instructions + +## 📋 Coding Standards + +### Code Style + +- **Use Prettier** for consistent formatting +- **Follow existing patterns** in the codebase +- **Write clear, self-documenting code** +- **Add comments** for complex logic +- **Use meaningful variable names** + +### File Organization + +- **Components**: Place in `src/components/` +- **Styles**: Use CSS modules or component-scoped styles +- **Assets**: Organize in `src/content/assets/` +- **Documentation**: Update relevant `.mdx` files + +### Commit Message Format + +We follow [Conventional Commits](https://www.conventionalcommits.org/): + +``` +type(scope): description + +feat: add new interactive chart component +fix: resolve mobile layout issues +docs: update installation instructions +style: improve button hover states +refactor: simplify component structure +test: add unit tests for utility functions +``` + +**Types**: `feat`, `fix`, `docs`, 
`style`, `refactor`, `test`, `chore` + +## 🧪 Testing + +### Manual Testing + +Before submitting: +- [ ] Test on different screen sizes +- [ ] Verify dark/light theme compatibility +- [ ] Check browser compatibility (Chrome, Firefox, Safari) +- [ ] Test with different content types +- [ ] Ensure accessibility standards + +### Automated Testing + +```bash +# Run build to catch errors +npm run build + +# Test PDF export +npm run export:pdf + +# Test LaTeX conversion +npm run latex:convert +``` + +## 📚 Documentation + +### Writing Guidelines + +- **Use clear, concise language** +- **Provide examples** for complex features +- **Include screenshots** for UI changes +- **Update both English content and code comments** + +### Documentation Structure + +- **README.md**: Project overview and quick start +- **CONTRIBUTING.md**: This file +- **Content files**: In `src/content/chapters/demo/` +- **Component docs**: Inline comments and examples + +## 🎯 Areas for Contribution + +### High Priority + +- **Bug fixes** and stability improvements +- **Accessibility enhancements** +- **Mobile responsiveness** +- **Performance optimizations** +- **Documentation improvements** + +### Feature Ideas + +- **New interactive components** +- **Additional export formats** +- **Enhanced LaTeX import** +- **Theme customization** +- **Plugin system** + +### Community + +- **Answer questions** in discussions +- **Share examples** of your work +- **Write tutorials** and guides +- **Help with translations** + +## 🚫 What Not to Contribute + +- **Breaking changes** without discussion +- **Major architectural changes** without approval +- **Dependencies** that significantly increase bundle size +- **Features** that don't align with the project's goals + +## 📞 Getting Help + +- **Discussions**: [Community tab](https://huggingface.co/spaces/tfrere/research-article-template/discussions) +- **Issues**: [Report 
bugs](https://huggingface.co/spaces/tfrere/research-article-template/discussions?status=open&type=issue) +- **Contact**: [@tfrere](https://huggingface.co/tfrere) on Hugging Face + +## 📄 License + +By contributing, you agree that your contributions will be licensed under the same [CC-BY-4.0 license](LICENSE) that covers the project. + +## 🙏 Recognition + +Contributors will be: +- **Listed in acknowledgments** (if desired) +- **Mentioned in release notes** for significant contributions +- **Credited** in relevant documentation + +Thank you for helping make scientific writing more accessible and interactive! 🎉 diff --git a/Dockerfile b/Dockerfile new file mode 100644 index 0000000000000000000000000000000000000000..8073f800d5831c53c02b9758b1282cbc6f7ef718 --- /dev/null +++ b/Dockerfile @@ -0,0 +1,77 @@ +# Use an official Node runtime as the base image for building the application +# Build with Playwright (browsers and deps ready) +FROM mcr.microsoft.com/playwright:v1.55.0-jammy AS build + +# Install git, git-lfs, and dependencies for Pandoc (only if ENABLE_LATEX_CONVERSION=true) +RUN apt-get update && apt-get install -y git git-lfs wget && apt-get clean + +# Install latest Pandoc from GitHub releases (only installed if needed later) +RUN wget -qO- https://github.com/jgm/pandoc/releases/download/3.8/pandoc-3.8-linux-amd64.tar.gz | tar xzf - -C /tmp && \ + cp /tmp/pandoc-3.8/bin/pandoc /usr/local/bin/ && \ + cp /tmp/pandoc-3.8/bin/pandoc-lua /usr/local/bin/ && \ + rm -rf /tmp/pandoc-3.8 + +# Set the working directory in the container +WORKDIR /app + +# Copy package.json and package-lock.json +COPY app/package*.json ./ + +# Install dependencies +RUN npm install + +# Copy the rest of the application code +COPY app/ . 
+ +# Conditionally convert LaTeX to MDX if ENABLE_LATEX_CONVERSION=true +ARG ENABLE_LATEX_CONVERSION=false +RUN if [ "$ENABLE_LATEX_CONVERSION" = "true" ]; then \ + echo "🔄 LaTeX importer enabled - running latex:convert..."; \ + npm run latex:convert; \ + else \ + echo "⏭️ LaTeX importer disabled - skipping..."; \ + fi + +# Pre-install notion-importer dependencies (for runtime import) +# Note: Notion import is done at RUNTIME (not build time) to access secrets +RUN cd scripts/notion-importer && npm install && cd ../.. + +# Ensure `public/data` is a real directory with real files (not a symlink) +# This handles the case where `public/data` is a symlink in the repo, which +# would be broken inside the container after COPY. +RUN set -e; \ + if [ -e public ] && [ ! -d public ]; then rm -f public; fi; \ + mkdir -p public; \ + if [ -L public/data ] || { [ -e public/data ] && [ ! -d public/data ]; }; then rm -f public/data; fi; \ + mkdir -p public/data; \ + cp -a src/content/assets/data/. public/data/ + +# Build the application (with minimal placeholder content) +RUN npm run build + +# Generate the PDF (light theme, full wait) +RUN npm run export:pdf -- --theme=light --wait=full + +# Generate LaTeX export +RUN npm run export:latex + +# Install nginx in the build stage (we'll use this image as final to keep Node.js) +RUN apt-get update && apt-get install -y nginx && apt-get clean && rm -rf /var/lib/apt/lists/* + +# Copy nginx configuration +COPY nginx.conf /etc/nginx/nginx.conf + +# Copy entrypoint script +COPY entrypoint.sh /entrypoint.sh +RUN chmod +x /entrypoint.sh + +# Create necessary directories and set permissions +RUN mkdir -p /var/cache/nginx /var/run /var/log/nginx /var/lib/nginx/body && \ + chmod -R 777 /var/cache/nginx /var/run /var/log/nginx /var/lib/nginx /etc/nginx/nginx.conf && \ + chmod -R 777 /app + +# Expose port 8080 +EXPOSE 8080 + +# Use entrypoint script that handles Notion import if enabled +ENTRYPOINT ["/entrypoint.sh"] diff --git a/LICENSE 
b/LICENSE new file mode 100644 index 0000000000000000000000000000000000000000..b267a53137822114e4c0bcef2e6383aaf52a70f1 --- /dev/null +++ b/LICENSE @@ -0,0 +1,33 @@ +Creative Commons Attribution 4.0 International License + +Copyright (c) 2024 Thibaud Frere + +This work is licensed under the Creative Commons Attribution 4.0 International License. +To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ +or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA. + +You are free to: + + Share — copy and redistribute the material in any medium or format + Adapt — remix, transform, and build upon the material for any purpose, even commercially. + +The licensor cannot revoke these freedoms as long as you follow the license terms. + +Under the following terms: + + Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. + + No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits. + +Notices: + + You do not have to comply with the license for elements of the material in the public domain or where your use is permitted by an applicable exception or limitation. + + No warranties are given. The license may not give you all of the permissions necessary for your intended use. For example, other rights such as publicity, privacy, or moral rights may limit how you use the material. 
+ +--- + +For the source code and technical implementation: +- The source code is available at: https://huggingface.co/spaces/tfrere/research-article-template +- Third-party figures and assets are excluded from this license and marked in their captions +- Dependencies and third-party libraries maintain their respective licenses diff --git a/README.md b/README.md new file mode 100644 index 0000000000000000000000000000000000000000..ad659ab0aa4fea6aa3b78daecec5bff018f205c0 --- /dev/null +++ b/README.md @@ -0,0 +1,122 @@ +--- +title: 'Evaluation Guidebook' +short_desc: 'How to properly evaluate LLMs in the modern age' +emoji: 📝 +colorFrom: blue +colorTo: indigo +sdk: docker +pinned: false +header: mini +app_port: 8080 +tags: + - research-article-template + - research paper + - scientific paper + - data visualization +thumbnail: https://HuggingFaceTB-smol-training-playbook.hf.space/thumb.png +--- +
+ +# Research Article Template + +[![License: CC BY 4.0](https://img.shields.io/badge/License-CC%20BY%204.0-lightgrey.svg)](https://creativecommons.org/licenses/by/4.0/) +[![Node.js Version](https://img.shields.io/badge/node-%3E%3D20.0.0-brightgreen.svg)](https://nodejs.org/) +[![Astro](https://img.shields.io/badge/Astro-4.10.0-orange.svg)](https://astro.build/) +[![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/tfrere/research-article-template) + + +**A modern, interactive template for scientific writing** that brings papers to life with web-native features. The web offers what static PDFs can't: **interactive diagrams**, **progressive notation**, and **exploratory views** that show how ideas behave. This template treats interactive artifacts—figures, math, code, and inspectable experiments—as **first-class** alongside prose, helping readers **build intuition** instead of skimming results—all with **minimal setup** and no web knowledge required. + +**[Try the live demo & documentation →](https://huggingface.co/spaces/tfrere/research-article-template)** + +
+ +## 🚀 Quick Start + +### Option 1: Duplicate on Hugging Face (Recommended) + +1. Visit **[🤗 Research Article Template](https://huggingface.co/spaces/tfrere/research-article-template)** +2. Click **"Duplicate this Space"** +3. Clone your new repository: + ```bash + git clone git@hf.co:spaces// + cd + ``` + +### Option 2: Clone Directly + +```bash +git clone https://github.com/tfrere/research-article-template.git +cd research-article-template +``` + +### Installation + +```bash +# Install Node.js 20+ (use nvm for version management) +nvm install 20 +nvm use 20 + +# Install Git LFS and pull assets +git lfs install +git lfs pull + +# Install dependencies +cd app +npm install + +# Start development server +npm run dev +``` + +Visit `http://localhost:4321` to see your site! + +## 🎯 Who This Is For + +- **Scientists** writing modern, web-native research papers +- **Educators** creating interactive, explorable lessons +- **Researchers** who want to focus on ideas, not infrastructure +- **Anyone** who values clear, engaging technical communication + +## 🌟 Inspired by Distill + +This template carries forward the spirit of [Distill](https://distill.pub/) (2016–2021), pushing interactive scientific writing even further with: +- Accessible, high-quality explanations +- Reproducible, production-ready demos +- Modern web technologies and best practices + +## 🤝 Contributing + +We welcome contributions! Please see our [Contributing Guidelines](CONTRIBUTING.md) for details. + +### Ways to Contribute + +- **Report bugs** - Open an issue with detailed information +- **Suggest features** - Share ideas for improvements +- **Improve documentation** - Help others get started +- **Submit code** - Fix bugs or add features +- **Join discussions** - Share feedback and ideas + +## 📄 License + +This project is licensed under the [Creative Commons Attribution 4.0 International License](https://creativecommons.org/licenses/by/4.0/). 
+ +- **Diagrams and text**: CC-BY 4.0 +- **Source code**: Available on [Hugging Face](https://huggingface.co/spaces/tfrere/research-article-template) +- **Third-party figures**: Excluded and marked in captions + +## 🙏 Acknowledgments + +- Inspired by [Distill](https://distill.pub/) and the interactive scientific writing movement +- Built with [Astro](https://astro.build/), [MDX](https://mdxjs.com/), and modern web technologies +- Community feedback and contributions from researchers worldwide + +## 📞 Support + +- **[Community Discussions](https://huggingface.co/spaces/tfrere/research-article-template/discussions)** - Ask questions and share ideas +- **[Report Issues](https://huggingface.co/spaces/tfrere/research-article-template/discussions?status=open&type=issue)** - Bug reports and feature requests +- **Contact**: [@tfrere](https://huggingface.co/tfrere) on Hugging Face + +--- + +**Made with ❤️ for the scientific community** \ No newline at end of file diff --git a/app/astro.config.mjs b/app/astro.config.mjs new file mode 100644 index 0000000000000000000000000000000000000000..9fb9cff3932e93e3d7e31db4cf75df045cbf821a --- /dev/null +++ b/app/astro.config.mjs @@ -0,0 +1,80 @@ +import { defineConfig } from 'astro/config'; +import mdx from '@astrojs/mdx'; +import svelte from '@astrojs/svelte'; +import mermaid from 'astro-mermaid'; +import compressor from 'astro-compressor'; +import remarkMath from 'remark-math'; +import rehypeKatex from 'rehype-katex'; +import remarkFootnotes from 'remark-footnotes'; +import rehypeSlug from 'rehype-slug'; +import rehypeAutolinkHeadings from 'rehype-autolink-headings'; +import rehypeCitation from 'rehype-citation'; +import rehypeCodeCopy from './plugins/rehype/code-copy.mjs'; +import rehypeReferencesAndFootnotes from './plugins/rehype/post-citation.mjs'; +import remarkIgnoreCitationsInCode from './plugins/remark/ignore-citations-in-code.mjs'; +import remarkUnwrapCitationLinks from './plugins/remark/unwrap-citation-links.mjs'; +import 
remarkDirective from 'remark-directive'; +import remarkOutputContainer from './plugins/remark/output-container.mjs'; +import rehypeRestoreAtInCode from './plugins/rehype/restore-at-in-code.mjs'; +import rehypeWrapTables from './plugins/rehype/wrap-tables.mjs'; +import rehypeWrapOutput from './plugins/rehype/wrap-outputs.mjs'; +// Built-in Shiki (dual themes) — no rehype-pretty-code + +// Plugins moved to app/plugins/* + +export default defineConfig({ + output: 'static', + integrations: [ + mermaid({ theme: 'neutral', autoTheme: true }), + mdx(), + svelte(), + // Precompress output with Gzip only (Brotli disabled due to server module mismatch) + compressor({ brotli: false, gzip: true }) + ], + devToolbar: { + enabled: false + }, + markdown: { + shikiConfig: { + themes: { + light: 'github-light', + dark: 'github-dark' + }, + defaultColor: false, + wrap: false, + langAlias: { + // Map MDX fences to TSX for better JSX tokenization + mdx: 'tsx' + } + }, + remarkPlugins: [ + remarkUnwrapCitationLinks, + remarkIgnoreCitationsInCode, + remarkMath, + [remarkFootnotes, { inlineNotes: true }], + remarkDirective, + remarkOutputContainer + ], + rehypePlugins: [ + rehypeSlug, + [rehypeAutolinkHeadings, { behavior: 'wrap' }], + [rehypeKatex, { + trust: true, + }], + [rehypeCitation, { + bibliography: 'src/content/bibliography.bib', + linkCitations: true, + csl: "apa", + noCite: false, + suppressBibliography: false, + }], + rehypeReferencesAndFootnotes, + rehypeRestoreAtInCode, + rehypeCodeCopy, + rehypeWrapOutput, + rehypeWrapTables + ] + } +}); + + diff --git a/app/package-lock.json b/app/package-lock.json new file mode 100644 index 0000000000000000000000000000000000000000..53ec7ac7a928402ea43e9b4e308dd16483ac53c2 Binary files /dev/null and b/app/package-lock.json differ diff --git a/app/package.json b/app/package.json new file mode 100644 index 0000000000000000000000000000000000000000..473e15216e9e41f2bd6881b6cc5f2470acee71fe Binary files /dev/null and b/app/package.json differ 
diff --git a/app/plugins/rehype/code-copy.mjs b/app/plugins/rehype/code-copy.mjs new file mode 100644 index 0000000000000000000000000000000000000000..29b135ee039c2af2f468bc836874f55a0a78ca17 --- /dev/null +++ b/app/plugins/rehype/code-copy.mjs @@ -0,0 +1,94 @@ +// Minimal rehype plugin to wrap code blocks with a copy button +// Exported as a standalone module to keep astro.config.mjs lean +export default function rehypeCodeCopy() { + return (tree) => { + // Walk the tree; lightweight visitor to find

+    const visit = (node, parent) => {
+      if (!node || typeof node !== 'object') return;
+      const children = Array.isArray(node.children) ? node.children : [];
+      if (node.tagName === 'pre' && children.some(c => c.tagName === 'code')) {
+        // Find code child
+        const code = children.find(c => c.tagName === 'code');
+        // Determine if single-line block: prefer Shiki lines, then text content
+        const countLinesFromShiki = () => {
+          const isLineEl = (el) => el && el.type === 'element' && el.tagName === 'span' && Array.isArray(el.properties?.className) && el.properties.className.includes('line');
+          const hasNonWhitespaceText = (node) => {
+            if (!node) return false;
+            if (node.type === 'text') return /\S/.test(String(node.value || ''));
+            const kids = Array.isArray(node.children) ? node.children : [];
+            return kids.some(hasNonWhitespaceText);
+          };
+          const collectLines = (node, acc) => {
+            if (!node || typeof node !== 'object') return;
+            if (isLineEl(node)) acc.push(node);
+            const kids = Array.isArray(node.children) ? node.children : [];
+            kids.forEach((k) => collectLines(k, acc));
+          };
+          const lines = [];
+          collectLines(code, lines);
+          const nonEmpty = lines.filter((ln) => hasNonWhitespaceText(ln)).length;
+          return nonEmpty || 0;
+        };
+        const countLinesFromText = () => {
+          // Parse raw text content of the <code> node including nested spans
+          const extractText = (node) => {
+            if (!node) return '';
+            if (node.type === 'text') return String(node.value || '');
+            const kids = Array.isArray(node.children) ? node.children : [];
+            return kids.map(extractText).join('');
+          };
+          const raw = extractText(code);
+          if (!raw || !/\S/.test(raw)) return 0;
+          return raw.split('\n').filter(line => /\S/.test(line)).length;
+        };
+        const lines = countLinesFromShiki() || countLinesFromText();
+        const isSingleLine = lines <= 1;
+        // Also treat code blocks shorter than a threshold as single-line (defensive)
+        if (!isSingleLine) {
+          const approxChars = (() => {
+            const extract = (n) => Array.isArray(n?.children) ? n.children.map(extract).join('') : (n?.type === 'text' ? String(n.value||'') : '');
+            return extract(code).length;
+          })();
+          if (approxChars < 6) {
+            node.__forceSingle = true;
+          }
+        }
+        // Replace <pre> with wrapper div.code-card containing button + pre
+        const wrapper = {
+          type: 'element',
+          tagName: 'div',
+          properties: { className: ['code-card'].concat((isSingleLine || node.__forceSingle) ? ['no-copy'] : []) },
+          children: (isSingleLine || node.__forceSingle) ? [ node ] : [
+            {
+              type: 'element',
+              tagName: 'button',
+              properties: { className: ['code-copy', 'button--ghost'], type: 'button', 'aria-label': 'Copy code' },
+              children: [
+                {
+                  type: 'element',
+                  tagName: 'svg',
+                  properties: { viewBox: '0 0 24 24', 'aria-hidden': 'true', focusable: 'false' },
+                  children: [
+                    { type: 'element', tagName: 'path', properties: { d: 'M16 1H4c-1.1 0-2 .9-2 2v12h2V3h12V1zm3 4H8c-1.1 0-2 .9-2 2v14c0 1.1.9 2 2 2h11c1.1 0 2-.9 2-2V7c0-1.1-.9-2-2-2zm0 16H8V7h11v14z' }, children: [] }
+                  ]
+                }
+              ]
+            },
+            node
+          ]
+        };
+        if (parent && Array.isArray(parent.children)) {
+          const idx = parent.children.indexOf(node);
+          if (idx !== -1) parent.children[idx] = wrapper;
+        }
+        return; // don't visit nested
+      }
+      children.forEach((c) => visit(c, node));
+    };
+    visit(tree, null);
+  };
+}
+
+
+
+
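Review note on `code-copy.mjs`: the text-based fallback that decides whether a block gets a copy button can be exercised on its own. The snippet below is a minimal sketch, not part of the plugin. `countNonEmptyLines` and `cardClasses` are hypothetical names introduced here for illustration, the input objects are hand-written hast-style nodes, and both the Shiki `span.line` path and the short-block (`approxChars < 6`) override are deliberately left out.

```javascript
// Hypothetical standalone sketch; these helpers are illustrative and are
// not exported by app/plugins/rehype/code-copy.mjs.

// Concatenate the text of a hast node and all of its descendants.
function extractText(node) {
  if (!node) return '';
  if (node.type === 'text') return String(node.value || '');
  const kids = Array.isArray(node.children) ? node.children : [];
  return kids.map(extractText).join('');
}

// Count lines that contain at least one non-whitespace character.
function countNonEmptyLines(codeNode) {
  const raw = extractText(codeNode);
  if (!/\S/.test(raw)) return 0;
  return raw.split('\n').filter((line) => /\S/.test(line)).length;
}

// Mirror the wrapper's class choice: blocks with at most one
// non-empty line are tagged `no-copy` and get no button.
function cardClasses(codeNode) {
  const single = countNonEmptyLines(codeNode) <= 1;
  return single ? ['code-card', 'no-copy'] : ['code-card'];
}

// A one-line block with a trailing newline.
const oneLiner = {
  type: 'element',
  tagName: 'code',
  children: [{ type: 'text', value: 'npm install\n' }]
};

// A two-line block whose text is split across nested spans.
const twoLines = {
  type: 'element',
  tagName: 'code',
  children: [
    { type: 'element', tagName: 'span', children: [{ type: 'text', value: 'npm install' }] },
    { type: 'text', value: '\n' },
    { type: 'element', tagName: 'span', children: [{ type: 'text', value: 'npm run dev' }] }
  ]
};

console.log(cardClasses(oneLiner).join(' '));
console.log(cardClasses(twoLines).join(' '));
```

Counting only non-empty lines means a trailing newline or blank padding does not turn a one-line command into a multi-line block, which appears to be the reason the plugin filters on `/\S/` in both counting paths.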
diff --git a/app/plugins/rehype/post-citation.mjs b/app/plugins/rehype/post-citation.mjs
new file mode 100644
index 0000000000000000000000000000000000000000..b91ed218aab6ab5f7244d8c74f25b49378e219b6
--- /dev/null
+++ b/app/plugins/rehype/post-citation.mjs
@@ -0,0 +1,493 @@
+// rehype plugin to post-process citations and footnotes at build-time
+// - Normalizes the bibliography into 
    with
  1. +// - Linkifies DOI/URL occurrences inside references +// - Appends back-reference links (↩ back: 1, 2, ...) from each reference to in-text citation anchors +// - Cleans up footnotes block (.footnotes) + +export default function rehypeReferencesAndFootnotes() { + return (tree) => { + const isElement = (n) => n && typeof n === 'object' && n.type === 'element'; + const getChildren = (n) => (Array.isArray(n?.children) ? n.children : []); + + const walk = (node, parent, fn) => { + if (!node || typeof node !== 'object') return; + fn && fn(node, parent); + const kids = getChildren(node); + for (const child of kids) walk(child, node, fn); + }; + + const ensureArray = (v) => (Array.isArray(v) ? v : v != null ? [v] : []); + + const hasClass = (el, name) => { + const cn = ensureArray(el?.properties?.className).map(String); + return cn.includes(name); + }; + + const setAttr = (el, key, val) => { + el.properties = el.properties || {}; + if (val == null) delete el.properties[key]; + else el.properties[key] = val; + }; + + const getAttr = (el, key) => (el?.properties ? 
el.properties[key] : undefined); + + // Shared helpers for backlinks + backrefs block + const collectBacklinksForIdSet = (idSet, anchorPrefix) => { + const idToBacklinks = new Map(); + const idToAnchorNodes = new Map(); + if (!idSet || idSet.size === 0) return { idToBacklinks, idToAnchorNodes }; + walk(tree, null, (node) => { + if (!isElement(node) || node.tagName !== 'a') return; + const href = String(getAttr(node, 'href') || ''); + if (!href.startsWith('#')) return; + const id = href.slice(1); + if (!idSet.has(id)) return; + // Ensure a stable id + let anchorId = String(getAttr(node, 'id') || ''); + if (!anchorId) { + const list = idToBacklinks.get(id) || []; + anchorId = `${anchorPrefix}-${id}-${list.length + 1}`; + setAttr(node, 'id', anchorId); + } + const list = idToBacklinks.get(id) || []; + list.push(anchorId); + idToBacklinks.set(id, list); + const nodes = idToAnchorNodes.get(id) || []; + nodes.push(node); + idToAnchorNodes.set(id, nodes); + }); + return { idToBacklinks, idToAnchorNodes }; + }; + + const createBackIcon = () => ({ + type: 'element', + tagName: 'svg', + properties: { + className: ['back-icon'], + width: 12, + height: 12, + viewBox: '0 0 24 24', + fill: 'none', + stroke: 'currentColor', + 'stroke-width': 2, + 'stroke-linecap': 'round', + 'stroke-linejoin': 'round', + 'aria-hidden': 'true', + focusable: 'false' + }, + children: [ + { type: 'element', tagName: 'line', properties: { x1: 12, y1: 19, x2: 12, y2: 5 }, children: [] }, + { type: 'element', tagName: 'polyline', properties: { points: '5 12 12 5 19 12' }, children: [] } + ] + }); + + const appendBackrefsBlock = (listElement, idToBacklinks, ariaLabel) => { + if (!listElement || !idToBacklinks || idToBacklinks.size === 0) return; + for (const li of getChildren(listElement)) { + if (!isElement(li) || li.tagName !== 'li') continue; + const id = String(getAttr(li, 'id') || ''); + if (!id) continue; + const keys = idToBacklinks.get(id); + if (!keys || !keys.length) continue; + // Remove 
pre-existing .backrefs in this li to avoid duplicates + li.children = getChildren(li).filter((n) => !(isElement(n) && n.tagName === 'small' && hasClass(n, 'backrefs'))); + const small = { + type: 'element', + tagName: 'small', + properties: { className: ['backrefs'] }, + children: [] + }; + if (keys.length === 1) { + // Single backlink: just the icon wrapped in the anchor + const a = { + type: 'element', + tagName: 'a', + properties: { href: `#${keys[0]}`, 'aria-label': ariaLabel }, + children: [createBackIcon()] + }; + small.children.push(a); + } else { + // Multiple backlinks: icon + label + numbered links + small.children.push(createBackIcon()); + small.children.push({ type: 'text', value: ' back: ' }); + keys.forEach((backId, idx) => { + small.children.push({ + type: 'element', + tagName: 'a', + properties: { href: `#${backId}`, 'aria-label': ariaLabel }, + children: [{ type: 'text', value: String(idx + 1) }] + }); + if (idx < keys.length - 1) small.children.push({ type: 'text', value: ', ' }); + }); + } + li.children.push(small); + } + }; + // Remove default back-reference anchors generated by remark-footnotes inside a footnote item + const getTextContent = (el) => { + if (!el) return ''; + const stack = [el]; + let out = ''; + while (stack.length) { + const cur = stack.pop(); + if (!cur) continue; + if (cur.type === 'text') out += String(cur.value || ''); + const kids = getChildren(cur); + for (let i = kids.length - 1; i >= 0; i--) stack.push(kids[i]); + } + return out; + }; + + // Check if an element is part of KaTeX structure + const isKaTeXElement = (el) => { + if (!isElement(el)) return false; + const className = ensureArray(getAttr(el, 'className') || []).map(String); + // Check for KaTeX classes + if (className.some(c => c.includes('katex') || c.includes('math'))) return true; + // Check parent chain for KaTeX + let current = el; + for (let depth = 0; depth < 10; depth++) { + // We need to walk up, but we don't have parent references in rehype AST + // 
So check by tagName and common KaTeX patterns + const tag = String(current.tagName || '').toLowerCase(); + if (tag === 'math' || className.some(c => c.includes('katex'))) return true; + break; // Can't walk up in AST, just check current element + } + return false; + }; + + const removeFootnoteBackrefAnchors = (el) => { + if (!isElement(el)) return; + // Never modify KaTeX elements or their contents + if (isKaTeXElement(el)) return; + + const kids = getChildren(el); + for (let i = kids.length - 1; i >= 0; i--) { + const child = kids[i]; + if (isElement(child)) { + // Never touch KaTeX elements + if (isKaTeXElement(child)) continue; + + if ( + child.tagName === 'a' && ( + getAttr(child, 'data-footnote-backref') != null || + hasClass(child, 'footnote-backref') || + String(getAttr(child, 'role') || '').toLowerCase() === 'doc-backlink' || + String(getAttr(child, 'aria-label') || '').toLowerCase().includes('back to content') || + String(getAttr(child, 'href') || '').startsWith('#fnref') || + // Fallback: text-based detection like "↩" or "↩2" + /^\s*↩\s*\d*\s*$/u.test(getTextContent(child)) + ) + ) { + // Remove the anchor + el.children.splice(i, 1); + continue; + } + // Recurse into element (but not if it's KaTeX) + removeFootnoteBackrefAnchors(child); + // If a wrapper like or became empty, remove it + // BUT only if it's not part of KaTeX + const becameKids = getChildren(child); + if ((child.tagName === 'sup' || child.tagName === 'span') && + (!becameKids || becameKids.length === 0) && + !isKaTeXElement(child)) { + el.children.splice(i, 1); + } + } + } + }; + + + const normDoiHref = (href) => { + if (!href) return href; + const DUP = /https?:\/\/(?:dx\.)?doi\.org\/(?:https?:\/\/(?:dx\.)?doi\.org\/)+/gi; + const ONE = /https?:\/\/(?:dx\.)?doi\.org\/(10\.[^\s<>"']+)/i; + href = String(href).replace(DUP, 'https://doi.org/'); + const m = href.match(ONE); + return m ? 
`https://doi.org/${m[1]}` : href; + }; + + const DOI_BARE = /\b10\.[0-9]{4,9}\/[\-._;()\/:A-Z0-9]+\b/gi; + const URL_GEN = /\bhttps?:\/\/[^\s<>()"']+/gi; + + const linkifyTextNode = (textNode) => { + const text = String(textNode.value || ''); + let last = 0; + const parts = []; + const pushText = (s) => { if (s) parts.push({ type: 'text', value: s }); }; + + const matches = []; + // Collect URL matches + let m; + URL_GEN.lastIndex = 0; + while ((m = URL_GEN.exec(text)) !== null) { + matches.push({ type: 'url', start: m.index, end: URL_GEN.lastIndex, raw: m[0] }); + } + // Collect DOI matches + DOI_BARE.lastIndex = 0; + while ((m = DOI_BARE.exec(text)) !== null) { + matches.push({ type: 'doi', start: m.index, end: DOI_BARE.lastIndex, raw: m[0] }); + } + matches.sort((a, b) => a.start - b.start); + + for (const match of matches) { + if (match.start < last) continue; // overlapping + pushText(text.slice(last, match.start)); + if (match.type === 'url') { + const href = normDoiHref(match.raw); + const doiOne = href.match(/https?:\/\/(?:dx\.)?doi\.org\/(10\.[^\s<>"']+)/i); + const a = { + type: 'element', + tagName: 'a', + properties: { href, target: '_blank', rel: 'noopener noreferrer' }, + children: [{ type: 'text', value: doiOne ? 
doiOne[1] : href }] + }; + parts.push(a); + } else { + const href = `https://doi.org/${match.raw}`; + const a = { + type: 'element', + tagName: 'a', + properties: { href, target: '_blank', rel: 'noopener noreferrer' }, + children: [{ type: 'text', value: match.raw }] + }; + parts.push(a); + } + last = match.end; + } + + pushText(text.slice(last)); + return parts; + }; + + const linkifyInElement = (el) => { + const kids = getChildren(el); + for (let i = 0; i < kids.length; i++) { + const child = kids[i]; + if (!child) continue; + if (child.type === 'text') { + const replacement = linkifyTextNode(child); + if (replacement.length === 1 && replacement[0].type === 'text') continue; + // Replace the single text node with multiple nodes + el.children.splice(i, 1, ...replacement); + i += replacement.length - 1; + } else if (isElement(child)) { + if (child.tagName === 'a') { + const href = normDoiHref(getAttr(child, 'href')); + setAttr(child, 'href', href); + const m = String(href || '').match(/https?:\/\/(?:dx\.)?doi\.org\/(10\.[^\s<>"']+)/i); + if (m && (!child.children || child.children.length === 0)) { + child.children = [{ type: 'text', value: m[1] }]; + } + continue; + } + linkifyInElement(child); + } + } + // Deduplicate adjacent identical anchors + for (let i = 1; i < el.children.length; i++) { + const prev = el.children[i - 1]; + const curr = el.children[i]; + if (isElement(prev) && isElement(curr) && prev.tagName === 'a' && curr.tagName === 'a') { + const key = `${getAttr(prev, 'href') || ''}|${(prev.children?.[0]?.value) || ''}`; + const key2 = `${getAttr(curr, 'href') || ''}|${(curr.children?.[0]?.value) || ''}`; + if (key === key2) { + el.children.splice(i, 1); + i--; + } + } + } + }; + + // Find references container and normalize its list + const findReferencesRoot = () => { + let found = null; + walk(tree, null, (node) => { + if (found) return; + if (!isElement(node)) return; + const id = getAttr(node, 'id'); + if (id === 'references' || hasClass(node, 
'references') || hasClass(node, 'bibliography')) { + found = node; + } + }); + return found; + }; + + const toOrderedList = (container) => { + // If there is already an
<ol>
      , use it; otherwise convert common structures + let ol = getChildren(container).find((c) => isElement(c) && c.tagName === 'ol'); + if (!ol) { + ol = { type: 'element', tagName: 'ol', properties: { className: ['references'] }, children: [] }; + const candidates = getChildren(container).filter((n) => isElement(n)); + if (candidates.length) { + for (const node of candidates) { + if (hasClass(node, 'csl-entry') || node.tagName === 'li' || node.tagName === 'p' || node.tagName === 'div') { + const li = { type: 'element', tagName: 'li', properties: {}, children: getChildren(node) }; + if (getAttr(node, 'id')) setAttr(li, 'id', getAttr(node, 'id')); + ol.children.push(li); + } + } + } + // Replace container children by the new ol + container.children = [ol]; + } + if (!hasClass(ol, 'references')) { + const cls = ensureArray(ol.properties?.className).map(String); + if (!cls.includes('references')) cls.push('references'); + ol.properties = ol.properties || {}; + ol.properties.className = cls; + } + return ol; + }; + + const refsRoot = findReferencesRoot(); + let refsOl = null; + const refIdSet = new Set(); + const refIdToExternalHref = new Map(); + + if (refsRoot) { + refsOl = toOrderedList(refsRoot); + // Collect item ids and linkify their content + for (const li of getChildren(refsOl)) { + if (!isElement(li) || li.tagName !== 'li') continue; + if (!getAttr(li, 'id')) { + // Try to find a nested element with id to promote + const nestedWithId = getChildren(li).find((n) => isElement(n) && getAttr(n, 'id')); + if (nestedWithId) setAttr(li, 'id', getAttr(nestedWithId, 'id')); + } + const id = getAttr(li, 'id'); + if (id) refIdSet.add(String(id)); + linkifyInElement(li); + // Record first external link href (e.g., DOI/URL) if present + if (id) { + let externalHref = null; + const stack = [li]; + while (stack.length) { + const cur = stack.pop(); + const kids = getChildren(cur); + for (const k of kids) { + if (isElement(k) && k.tagName === 'a') { + const href = 
String(getAttr(k, 'href') || ''); + if (/^https?:\/\//i.test(href)) { + externalHref = href; + break; + } + } + if (isElement(k)) stack.push(k); + } + if (externalHref) break; + } + if (externalHref) refIdToExternalHref.set(String(id), externalHref); + } + } + setAttr(refsRoot, 'data-built-refs', '1'); + } + + // Collect in-text anchors that point to references ids + const { idToBacklinks: refIdToBacklinks, idToAnchorNodes: refIdToCitationAnchors } = collectBacklinksForIdSet(refIdSet, 'refctx'); + + // Append backlinks into references list items + appendBackrefsBlock(refsOl, refIdToBacklinks, 'Back to citation'); + + // Rewrite in-text citation anchors to external link when available + if (refIdToCitationAnchors.size > 0) { + for (const [id, anchors] of refIdToCitationAnchors.entries()) { + const ext = refIdToExternalHref.get(id); + if (!ext) continue; + for (const a of anchors) { + setAttr(a, 'data-ref-id', id); + setAttr(a, 'href', ext); + const existingTarget = getAttr(a, 'target'); + if (!existingTarget) setAttr(a, 'target', '_blank'); + const rel = String(getAttr(a, 'rel') || ''); + const relSet = new Set(rel ? rel.split(/\s+/) : []); + relSet.add('noopener'); + relSet.add('noreferrer'); + setAttr(a, 'rel', Array.from(relSet).join(' ')); + } + } + } + + // Deep clone a node and all its children (preserve KaTeX structure) + const deepCloneNode = (node) => { + if (!node || typeof node !== 'object') return node; + if (node.type === 'text') { + return { type: 'text', value: node.value }; + } + if (node.type === 'element') { + const cloned = { + type: 'element', + tagName: node.tagName, + properties: node.properties ? 
JSON.parse(JSON.stringify(node.properties)) : {}, + children: [] + }; + const kids = getChildren(node); + for (const child of kids) { + cloned.children.push(deepCloneNode(child)); + } + return cloned; + } + return node; + }; + + // Footnotes cleanup + backrefs harmonized with references + const cleanupFootnotes = () => { + let root = null; + walk(tree, null, (node) => { + if (!isElement(node)) return; + if (hasClass(node, 'footnotes')) root = node; + }); + if (!root) return { root: null, ol: null, idSet: new Set() }; + // Remove
<hr>
      direct children + root.children = getChildren(root).filter((n) => !(isElement(n) && n.tagName === 'hr')); + // Ensure an
<ol>
        + let ol = getChildren(root).find((c) => isElement(c) && c.tagName === 'ol'); + if (!ol) { + ol = { type: 'element', tagName: 'ol', properties: {}, children: [] }; + const items = getChildren(root).filter((n) => isElement(n) && (n.tagName === 'li' || hasClass(n, 'footnote') || n.tagName === 'p' || n.tagName === 'div')); + if (items.length) { + for (const it of items) { + // Deep clone to preserve all properties including KaTeX structure + const clonedChildren = getChildren(it).map(deepCloneNode); + const li = { type: 'element', tagName: 'li', properties: {}, children: clonedChildren }; + // Promote nested id if present (e.g.,

        ) + const nestedWithId = getChildren(it).find((n) => isElement(n) && getAttr(n, 'id')); + if (nestedWithId) setAttr(li, 'id', getAttr(nestedWithId, 'id')); + ol.children.push(li); + } + } + root.children = [ol]; + } + // For existing structures, try to promote ids from children when missing + for (const li of getChildren(ol)) { + if (!isElement(li) || li.tagName !== 'li') continue; + if (!getAttr(li, 'id')) { + const nestedWithId = getChildren(li).find((n) => isElement(n) && getAttr(n, 'id')); + if (nestedWithId) setAttr(li, 'id', getAttr(nestedWithId, 'id')); + } + // Remove default footnote backrefs anywhere inside (to avoid duplication) + // But preserve KaTeX elements + removeFootnoteBackrefAnchors(li); + } + setAttr(root, 'data-built-footnotes', '1'); + // Collect id set + const idSet = new Set(); + for (const li of getChildren(ol)) { + if (!isElement(li) || li.tagName !== 'li') continue; + const id = getAttr(li, 'id'); + if (id) idSet.add(String(id)); + } + return { root, ol, idSet }; + }; + + const { root: footRoot, ol: footOl, idSet: footIdSet } = cleanupFootnotes(); + + // Collect in-text anchors pointing to footnotes + const { idToBacklinks: footIdToBacklinks } = collectBacklinksForIdSet(footIdSet, 'footctx'); + + // Append backlinks into footnote list items (identical pattern to references) + appendBackrefsBlock(footOl, footIdToBacklinks, 'Back to footnote call'); + }; +} + + diff --git a/app/plugins/rehype/restore-at-in-code.mjs b/app/plugins/rehype/restore-at-in-code.mjs new file mode 100644 index 0000000000000000000000000000000000000000..09db2b1fb8720cefeb7a7d94ea85ba4db47b1612 --- /dev/null +++ b/app/plugins/rehype/restore-at-in-code.mjs @@ -0,0 +1,22 @@ +// Rehype plugin to restore '@' inside code nodes after rehype-citation ran +export default function rehypeRestoreAtInCode() { + return (tree) => { + const restoreInNode = (node) => { + if (!node || typeof node !== 'object') return; + const isText = node.type === 'text'; + if (isText && 
typeof node.value === 'string' && node.value.includes('__AT_SENTINEL__')) { + node.value = node.value.replace(/__AT_SENTINEL__/g, '@'); + } + const isCodeEl = node.type === 'element' && node.tagName === 'code'; + const children = Array.isArray(node.children) ? node.children : []; + if (isCodeEl && children.length) { + children.forEach(restoreInNode); + return; + } + children.forEach(restoreInNode); + }; + restoreInNode(tree); + }; +} + + diff --git a/app/plugins/rehype/wrap-outputs.mjs b/app/plugins/rehype/wrap-outputs.mjs new file mode 100644 index 0000000000000000000000000000000000000000..307047febe085ffa78f2468978e588bc3749b148 --- /dev/null +++ b/app/plugins/rehype/wrap-outputs.mjs @@ -0,0 +1,38 @@ +// Wrap plain-text content inside

        <section class="code-output"> into a <pre>
        +export default function rehypeWrapOutput() {
        +  return (tree) => {
        +    const isWhitespace = (value) => typeof value === 'string' && !/\S/.test(value);
        +    const extractText = (node) => {
        +      if (!node) return '';
        +      if (node.type === 'text') return String(node.value || '');
        +      const kids = Array.isArray(node.children) ? node.children : [];
        +      return kids.map(extractText).join('');
        +    };
        +    const visit = (node) => {
        +      if (!node || typeof node !== 'object') return;
        +      const children = Array.isArray(node.children) ? node.children : [];
        +      if (node.type === 'element' && node.tagName === 'section') {
        +        const className = node.properties?.className || [];
        +        const classes = Array.isArray(className) ? className : [className].filter(Boolean);
        +        if (classes.includes('code-output')) {
        +          const meaningful = children.filter((c) => !(c.type === 'text' && isWhitespace(c.value)));
        +          if (meaningful.length === 1) {
        +            const only = meaningful[0];
        +            const isPlainParagraph = only.type === 'element' && only.tagName === 'p' && (only.children || []).every((c) => c.type === 'text');
        +            const isPlainText = only.type === 'text';
        +            if (isPlainParagraph || isPlainText) {
        +              const text = isPlainText ? String(only.value || '') : extractText(only);
        +              node.children = [
        +                { type: 'element', tagName: 'pre', properties: {}, children: [ { type: 'text', value: text } ] }
        +              ];
        +            }
        +          }
        +        }
        +      }
        +      children.forEach(visit);
        +    };
        +    visit(tree);
        +  };
        +}
        +
        +
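The wrapping rule in `wrap-outputs.mjs` above can be tried outside the build with a hand-written hast tree. The following is a minimal sketch, not the plugin itself: `wrapCodeOutputs` and the sample tree are hypothetical names and data introduced for illustration, and the logic is copied by hand from the diff rather than imported from it.

```javascript
// Hypothetical stand-alone copy of the rule used by rehypeWrapOutput, for illustration only.
const isWhitespace = (value) => typeof value === 'string' && !/\S/.test(value);

// Concatenate the text of a node and all of its descendants.
const extractText = (node) => {
  if (!node) return '';
  if (node.type === 'text') return String(node.value || '');
  const kids = Array.isArray(node.children) ? node.children : [];
  return kids.map(extractText).join('');
};

function wrapCodeOutputs(node) {
  if (!node || typeof node !== 'object') return;
  const children = Array.isArray(node.children) ? node.children : [];
  const cls = node.properties?.className;
  const classes = Array.isArray(cls) ? cls : [cls].filter(Boolean);
  if (node.type === 'element' && node.tagName === 'section' && classes.includes('code-output')) {
    // Ignore whitespace-only text nodes around the real content.
    const meaningful = children.filter((c) => !(c.type === 'text' && isWhitespace(c.value)));
    const only = meaningful.length === 1 ? meaningful[0] : null;
    const plainParagraph = Boolean(only) && only.type === 'element' && only.tagName === 'p' &&
      (only.children || []).every((c) => c.type === 'text');
    if (only && (plainParagraph || only.type === 'text')) {
      // Replace the section content with a single <pre> holding the raw text.
      node.children = [
        { type: 'element', tagName: 'pre', properties: {}, children: [{ type: 'text', value: extractText(only) }] }
      ];
    }
  }
  // As in the diff, recursion continues over the children captured before any replacement.
  children.forEach(wrapCodeOutputs);
}

// Made-up sample tree: one plain-text output and one output with inline markup.
const tree = {
  type: 'root',
  children: [
    {
      type: 'element', tagName: 'section', properties: { className: ['code-output'] },
      children: [
        { type: 'text', value: '\n' },
        { type: 'element', tagName: 'p', properties: {}, children: [{ type: 'text', value: 'hello world' }] },
        { type: 'text', value: '\n' }
      ]
    },
    {
      type: 'element', tagName: 'section', properties: { className: ['code-output'] },
      children: [
        {
          type: 'element', tagName: 'p', properties: {},
          children: [
            { type: 'text', value: 'see ' },
            { type: 'element', tagName: 'em', properties: {}, children: [{ type: 'text', value: 'logs' }] }
          ]
        }
      ]
    }
  ]
};

wrapCodeOutputs(tree);
console.log(JSON.stringify(tree.children[0].children));
```

Only the first section is rewritten. The second keeps its paragraph because it contains an `<em>` element, so it is not treated as plain text.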
        diff --git a/app/plugins/rehype/wrap-tables.mjs b/app/plugins/rehype/wrap-tables.mjs
        new file mode 100644
        index 0000000000000000000000000000000000000000..fc7944cb737ba8cfd2cbed28b66e2527c0234f89
        --- /dev/null
        +++ b/app/plugins/rehype/wrap-tables.mjs
        @@ -0,0 +1,43 @@
        +// rehype plugin: wrap bare <table> elements in a <div class="table-scroll">
        container +// so that tables stay width:100% while enabling horizontal scroll when content overflows + +export default function rehypeWrapTables() { + return (tree) => { + const isElement = (n) => n && typeof n === 'object' && n.type === 'element'; + const getChildren = (n) => (Array.isArray(n?.children) ? n.children : []); + + const walk = (node, parent, fn) => { + if (!node || typeof node !== 'object') return; + fn && fn(node, parent); + const kids = getChildren(node); + for (const child of kids) walk(child, node, fn); + }; + + const ensureArray = (v) => (Array.isArray(v) ? v : v != null ? [v] : []); + const hasClass = (el, name) => ensureArray(el?.properties?.className).map(String).includes(name); + + const wrapTable = (tableNode, parent) => { + if (!parent || !Array.isArray(parent.children)) return; + // Don't double-wrap if already inside .table-scroll + if (parent.tagName === 'div' && hasClass(parent, 'table-scroll')) return; + + const wrapper = { + type: 'element', + tagName: 'div', + properties: { className: ['table-scroll'] }, + children: [tableNode] + }; + + const idx = parent.children.indexOf(tableNode); + if (idx >= 0) parent.children.splice(idx, 1, wrapper); + }; + + walk(tree, null, (node, parent) => { + if (!isElement(node)) return; + if (node.tagName !== 'table') return; + wrapTable(node, parent); + }); + }; +} + + diff --git a/app/plugins/remark/ignore-citations-in-code.mjs b/app/plugins/remark/ignore-citations-in-code.mjs new file mode 100644 index 0000000000000000000000000000000000000000..b5c3e279088bcbd325bdb2d031de77ed48fa5591 --- /dev/null +++ b/app/plugins/remark/ignore-citations-in-code.mjs @@ -0,0 +1,21 @@ +// Remark plugin to ignore citations inside code (block and inline) +export default function remarkIgnoreCitationsInCode() { + return (tree) => { + const visit = (node) => { + if (!node || typeof node !== 'object') return; + const type = node.type; + if (type === 'code' || type === 'inlineCode') { + if (typeof node.value === 
'string' && node.value.includes('@')) { + // Use a sentinel to avoid rehype-citation, will be restored later in rehype + node.value = node.value.replace(/@/g, '__AT_SENTINEL__'); + } + return; // do not traverse into code + } + const children = Array.isArray(node.children) ? node.children : []; + children.forEach(visit); + }; + visit(tree); + }; +} + + diff --git a/app/plugins/remark/output-container.mjs b/app/plugins/remark/output-container.mjs new file mode 100644 index 0000000000000000000000000000000000000000..bb25220416a44e22007345265acb8d2eb803e93b --- /dev/null +++ b/app/plugins/remark/output-container.mjs @@ -0,0 +1,23 @@ +// Transform `:::output ... :::` into a
<section class="code-output">
        wrapper +// Requires remark-directive to be applied before this plugin + +export default function remarkOutputContainer() { + return (tree) => { + const visit = (node) => { + if (!node || typeof node !== 'object') return; + + if (node.type === 'containerDirective' && node.name === 'output') { + node.data = node.data || {}; + node.data.hName = 'section'; + node.data.hProperties = { className: ['code-output'] }; + } + + const children = Array.isArray(node.children) ? node.children : []; + for (const child of children) visit(child); + }; + + visit(tree); + }; +} + + diff --git a/app/plugins/remark/outputs-container.mjs b/app/plugins/remark/outputs-container.mjs new file mode 100644 index 0000000000000000000000000000000000000000..5602aca8e635e00de98f49704be7e51e4f3e87b0 --- /dev/null +++ b/app/plugins/remark/outputs-container.mjs @@ -0,0 +1,23 @@ +// Transform `:::outputs ... :::` into a
<section class="code-outputs">
        wrapper +// Requires remark-directive to be applied before this plugin + +export default function remarkOutputsContainer() { + return (tree) => { + const visit = (node) => { + if (!node || typeof node !== 'object') return; + + if (node.type === 'containerDirective' && node.name === 'outputs') { + node.data = node.data || {}; + node.data.hName = 'section'; + node.data.hProperties = { className: ['code-outputs'] }; + } + + const children = Array.isArray(node.children) ? node.children : []; + for (const child of children) visit(child); + }; + + visit(tree); + }; +} + + diff --git a/app/plugins/remark/unwrap-citation-links.mjs b/app/plugins/remark/unwrap-citation-links.mjs new file mode 100644 index 0000000000000000000000000000000000000000..89afd8d9b63d311aa6642a231741e8b219a6a962 --- /dev/null +++ b/app/plugins/remark/unwrap-citation-links.mjs @@ -0,0 +1,57 @@ +// Remark plugin that turns markdown links containing citations into plain citations +// Transforms [@reference](url) into [@reference] +export default function remarkUnwrapCitationLinks() { + return (tree) => { + // Helper function to extract the text content of a node + const getTextContent = (node) => { + if (!node) return ''; + if (node.type === 'text') return node.value || ''; + if (Array.isArray(node.children)) { + return node.children.map(getTextContent).join(''); + } + return ''; + }; + + const visit = (node, parent) => { + if (!node || typeof node !== 'object') return; + + // Visit the children first (post-order traversal) + const children = Array.isArray(node.children) ? 
node.children : []; + for (let i = 0; i < children.length; i++) { + const child = children[i]; + visit(child, node); + } + + // If this is a 'link' node, check its content + if (node.type === 'link' && parent && Array.isArray(parent.children)) { + // Get the text content of the link + const textContent = getTextContent(node); + + // Debug + console.log('🔍 Link found:', { + text: textContent, + url: node.url, + matches: /^@\w+/.test(textContent.trim()) + }); + + // Check whether it is a citation (starts with @) + if (textContent && /^@\w+/.test(textContent.trim())) { + // Find the node's index in its parent + const index = parent.children.indexOf(node); + + if (index !== -1) { + console.log('✅ Transformation:', textContent); + // Replace the link node with a plain text node + parent.children[index] = { + type: 'text', + value: textContent.trim() + }; + } + } + } + }; + + visit(tree, null); + }; +} + diff --git a/app/postcss.config.mjs b/app/postcss.config.mjs new file mode 100644 index 0000000000000000000000000000000000000000..65fe6e9fd4437c66b3b2e303bd091a66cff025e5 --- /dev/null +++ b/app/postcss.config.mjs @@ -0,0 +1,14 @@ +// PostCSS config enabling Custom Media Queries +// Allows usage of: @media (--bp-content-collapse) { ... 
} + +import postcssCustomMedia from 'postcss-custom-media'; +import postcssPresetEnv from 'postcss-preset-env'; + +export default { + plugins: [ + postcssCustomMedia(), + postcssPresetEnv({ + stage: 0 + }) + ] +}; diff --git a/app/public/data b/app/public/data new file mode 120000 index 0000000000000000000000000000000000000000..7af5c0541877d3e5fd06c4a0bf6f8ffa18d2739a --- /dev/null +++ b/app/public/data @@ -0,0 +1 @@ +../src/content/assets/data \ No newline at end of file diff --git a/app/public/finetasks b/app/public/finetasks new file mode 120000 index 0000000000000000000000000000000000000000..c3b2304fb916b12e6cfc742941f9cb53d8c059e0 --- /dev/null +++ b/app/public/finetasks @@ -0,0 +1 @@ +../src/content/assets/finetasks \ No newline at end of file diff --git a/app/public/hf-space-parent-listener.js b/app/public/hf-space-parent-listener.js new file mode 100644 index 0000000000000000000000000000000000000000..d114abdeef1c38e61884fd16d09e9c757f454461 --- /dev/null +++ b/app/public/hf-space-parent-listener.js @@ -0,0 +1,55 @@ +/** + * Script for Hugging Face Spaces parent window + * This script listens to iframe messages and updates the parent window URL + * + * Usage instructions: + * 1. Add this script to your Hugging Face Space in app.py or in a Gradio component + * 2. 
Or use it in an HTML page that contains your iframe + */ + +(function () { + 'use strict'; + + // Listen to iframe messages + window.addEventListener('message', function (event) { + + // Check message type + if (event.data && event.data.type) { + switch (event.data.type) { + case 'urlChange': + case 'anchorChange': + case 'HF_SPACE_URL_UPDATE': + handleUrlChange(event.data); + break; + default: + // Unknown message type, ignore + } + } + }); + + function handleUrlChange(data) { + try { + const hash = data.hash || data.anchorId; + + if (hash) { + // Update URL with new anchor + const newUrl = new URL(window.location); + newUrl.hash = hash; + + // Use replaceState to avoid adding an entry to history + window.history.replaceState(null, '', newUrl.toString()); + } + } catch (error) { + // Silent error when updating URL + } + } + + // Utility function to test communication + window.testIframeCommunication = function () { + const iframe = document.querySelector('iframe'); + if (iframe) { + iframe.contentWindow.postMessage({ type: 'test' }, '*'); + } + }; + +})(); diff --git a/app/public/scripts/color-palettes.js b/app/public/scripts/color-palettes.js new file mode 100644 index 0000000000000000000000000000000000000000..370b1f464142e0d9280855b18f8f636db810ea6e --- /dev/null +++ b/app/public/scripts/color-palettes.js @@ -0,0 +1,274 @@ +// Global color palettes generator and watcher +// - Observes CSS variable --primary-color and theme changes +// - Generates categorical, sequential, and diverging palettes (OKLCH/OKLab) +// - Exposes results as CSS variables on :root +// - Supports variable color counts per palette via CSS vars +// - Dispatches a 'palettes:updated' CustomEvent after each update + +(() => { + const MODE = { cssRoot: document.documentElement }; + + const getCssVar = (name) => { + try { return getComputedStyle(MODE.cssRoot).getPropertyValue(name).trim(); } catch { return ''; } + }; + const getIntFromCssVar = (name, fallback) => { + const raw = getCssVar(name); 
+ if (!raw) return fallback; + const v = parseInt(String(raw), 10); + if (Number.isNaN(v)) return fallback; + return v; + }; + const clamp = (n, min, max) => Math.max(min, Math.min(max, n)); + + // Color math (OKLab/OKLCH) + const srgbToLinear = (u) => (u <= 0.04045 ? u / 12.92 : Math.pow((u + 0.055) / 1.055, 2.4)); + const linearToSrgb = (u) => (u <= 0.0031308 ? 12.92 * u : 1.055 * Math.pow(Math.max(0, u), 1 / 2.4) - 0.055); + const rgbToOklab = (r, g, b) => { + const rl = srgbToLinear(r), gl = srgbToLinear(g), bl = srgbToLinear(b); + const l = Math.cbrt(0.4122214708 * rl + 0.5363325363 * gl + 0.0514459929 * bl); + const m = Math.cbrt(0.2119034982 * rl + 0.6806995451 * gl + 0.1073969566 * bl); + const s = Math.cbrt(0.0883024619 * rl + 0.2817188376 * gl + 0.6299787005 * bl); + const L = 0.2104542553 * l + 0.7936177850 * m - 0.0040720468 * s; + const a = 1.9779984951 * l - 2.4285922050 * m + 0.4505937099 * s; + const b2 = 0.0259040371 * l + 0.7827717662 * m - 0.8086757660 * s; + return { L, a, b: b2 }; + }; + const oklabToRgb = (L, a, b) => { + const l_ = L + 0.3963377774 * a + 0.2158037573 * b; + const m_ = L - 0.1055613458 * a - 0.0638541728 * b; + const s_ = L - 0.0894841775 * a - 1.2914855480 * b; + const l = l_ * l_ * l_; + const m = m_ * m_ * m_; + const s = s_ * s_ * s_; + const r = linearToSrgb(+4.0767416621 * l - 3.3077115913 * m + 0.2309699292 * s); + const g = linearToSrgb(-1.2684380046 * l + 2.6097574011 * m - 0.3413193965 * s); + const b3 = linearToSrgb(-0.0041960863 * l - 0.7034186147 * m + 1.7076147010 * s); + return { r, g, b: b3 }; + }; + const oklchToOklab = (L, C, hDeg) => { const h = (hDeg * Math.PI) / 180; return { L, a: C * Math.cos(h), b: C * Math.sin(h) }; }; + const oklabToOklch = (L, a, b) => { const C = Math.sqrt(a * a + b * b); let h = Math.atan2(b, a) * 180 / Math.PI; if (h < 0) h += 360; return { L, C, h }; }; + const clamp01 = (x) => Math.min(1, Math.max(0, x)); + const isInGamut = ({ r, g, b }) => r >= 0 && r <= 1 && g >= 0 && g <= 1 
&& b >= 0 && b <= 1; + const toHex = ({ r, g, b }) => { + const R = Math.round(clamp01(r) * 255), G = Math.round(clamp01(g) * 255), B = Math.round(clamp01(b) * 255); + const h = (n) => n.toString(16).padStart(2, '0'); + return `#${h(R)}${h(G)}${h(B)}`.toUpperCase(); + }; + const oklchToHexSafe = (L, C, h) => { let c = C; for (let i = 0; i < 12; i++) { const { a, b } = oklchToOklab(L, c, h); const rgb = oklabToRgb(L, a, b); if (isInGamut(rgb)) return toHex(rgb); c = Math.max(0, c - 0.02); } return toHex(oklabToRgb(L, 0, 0)); }; + const parseCssColorToRgb = (css) => { try { const el = document.createElement('span'); el.style.color = css; document.body.appendChild(el); const cs = getComputedStyle(el).color; document.body.removeChild(el); const m = cs.match(/rgba?\((\d+),\s*(\d+),\s*(\d+)/i); if (!m) return null; return { r: Number(m[1]) / 255, g: Number(m[2]) / 255, b: Number(m[3]) / 255 }; } catch { return null; } }; + + // Get primary color in OKLCH format to preserve precision + const getPrimaryOKLCH = () => { + const css = getCssVar('--primary-color'); + if (!css) return null; + + // For OKLCH colors, return the exact values without conversion + if (css.includes('oklch')) { + const oklchMatch = css.match(/oklch\(([^)]+)\)/); + if (oklchMatch) { + const values = oklchMatch[1].split(/\s+/).map(v => parseFloat(v.trim())); + if (values.length >= 3) { + const [L, C, h] = values; + return { L, C, h }; + } + } + } + + // For non-OKLCH colors, convert to OKLCH for consistency + const rgb = parseCssColorToRgb(css); + if (rgb) { + const { L, a, b } = rgbToOklab(rgb.r, rgb.g, rgb.b); + const { C, h } = oklabToOklch(L, a, b); + return { L, C, h }; + } + return null; + }; + + // Keep getPrimaryHex for backward compatibility, but now it converts from OKLCH + const getPrimaryHex = () => { + const oklch = getPrimaryOKLCH(); + if (!oklch) return null; + + const { a, b } = oklchToOklab(oklch.L, oklch.C, oklch.h); + const rgb = oklabToRgb(oklch.L, a, b); + return toHex(rgb); + }; + 
// No count management via CSS anymore; counts are passed directly to the API + + const generators = { + categorical: (baseOKLCH, count) => { + const { L, C, h } = baseOKLCH; + const L0 = Math.min(0.85, Math.max(0.4, L)); + const C0 = Math.min(0.35, Math.max(0.1, C || 0.2)); + const total = Math.max(1, Math.min(12, count || 8)); + const hueStep = 360 / total; + const results = []; + for (let i = 0; i < total; i++) { + const hDeg = (h + i * hueStep) % 360; + const lVar = ((i % 3) - 1) * 0.04; + results.push(oklchToHexSafe(Math.max(0.4, Math.min(0.85, L0 + lVar)), C0, hDeg)); + } + return results; + }, + sequential: (baseOKLCH, count) => { + const { L, C, h } = baseOKLCH; + const total = Math.max(1, Math.min(12, count || 8)); + const startL = Math.max(0.25, L - 0.18); + const endL = Math.min(0.92, L + 0.18); + const cBase = Math.min(0.33, Math.max(0.08, C * 0.9 + 0.06)); + const out = []; + for (let i = 0; i < total; i++) { + const t = total === 1 ? 0 : i / (total - 1); + const lNow = startL * (1 - t) + endL * t; + const cNow = cBase * (0.85 + 0.15 * (1 - Math.abs(0.5 - t) * 2)); + out.push(oklchToHexSafe(lNow, cNow, h)); + } + return out; + }, + diverging: (baseOKLCH, count) => { + const { L, C, h } = baseOKLCH; + const total = Math.max(1, Math.min(12, count || 8)); + + // Left endpoint: EXACT primary color (no darkening) + const leftLab = oklchToOklab(L, C, h); + // Right endpoint: complement with same L and similar C (clamped safe) + const compH = (h + 180) % 360; + const cSafe = Math.min(0.35, Math.max(0.08, C)); + const rightLab = oklchToOklab(L, cSafe, compH); + const whiteLab = { L: 0.98, a: 0, b: 0 }; // center near‑white + + const hexFromOKLab = (L, a, b) => toHex(oklabToRgb(L, a, b)); + const lerp = (a, b, t) => a + (b - a) * t; + const lerpOKLabHex = (A, B, t) => hexFromOKLab(lerp(A.L, B.L, t), lerp(A.a, B.a, t), lerp(A.b, B.b, t)); + + const out = []; + if (total % 2 === 1) { + const nSide = (total - 1) >> 1; // items on each side + // Left side: include 
left endpoint exactly at index 0 + for (let i = 0; i < nSide; i++) { + const t = nSide <= 1 ? 0 : (i / (nSide - 1)); // 0 .. 1 + // Move from leftLab to a value close (but not equal) to white; ensure last before center is lighter + const tt = t * 0.9; // keep some distance from pure white before center + out.push(lerpOKLabHex(leftLab, whiteLab, tt)); + } + // Center + out.push(hexFromOKLab(whiteLab.L, whiteLab.a, whiteLab.b)); + // Right side: start near white and end EXACTLY at rightLab + for (let i = 0; i < nSide; i++) { + const t = nSide <= 1 ? 1 : ((i + 1) / nSide); // (1/n)..1 + const tt = Math.max(0.1, t); // avoid starting at pure white + out.push(lerpOKLabHex(whiteLab, rightLab, tt)); + } + // Ensure first and last are exact endpoints + if (out.length) { out[0] = hexFromOKLab(leftLab.L, leftLab.a, leftLab.b); out[out.length - 1] = hexFromOKLab(rightLab.L, rightLab.a, rightLab.b); } + } else { + const nSide = total >> 1; + // Left half including left endpoint, approaching white but not reaching it + for (let i = 0; i < nSide; i++) { + const t = nSide <= 1 ? 0 : (i / (nSide - 1)); // 0 .. 1 + const tt = t * 0.9; + out.push(lerpOKLabHex(leftLab, whiteLab, tt)); + } + // Right half: mirror from near white to exact right endpoint + for (let i = 0; i < nSide; i++) { + const t = nSide <= 1 ? 
1 : ((i + 1) / nSide); // (1/n)..1 + const tt = Math.max(0.1, t); + out.push(lerpOKLabHex(whiteLab, rightLab, tt)); + } + if (out.length) { out[0] = hexFromOKLab(leftLab.L, leftLab.a, leftLab.b); out[out.length - 1] = hexFromOKLab(rightLab.L, rightLab.a, rightLab.b); } + } + return out; + } + }; + + let lastSignature = ''; + + const updatePalettes = () => { + const primaryOKLCH = getPrimaryOKLCH(); + const primaryHex = getPrimaryHex(); + const signature = `${primaryOKLCH?.L},${primaryOKLCH?.C},${primaryOKLCH?.h}`; + if (signature === lastSignature) return; + lastSignature = signature; + try { document.dispatchEvent(new CustomEvent('palettes:updated', { detail: { primary: primaryHex, primaryOKLCH } })); } catch { } + }; + + const bootstrap = () => { + // Initial setup - only run once on page load + updatePalettes(); + + // Observer will handle all subsequent changes + const mo = new MutationObserver(() => updatePalettes()); + mo.observe(MODE.cssRoot, { attributes: true, attributeFilter: ['style', 'data-theme'] }); + + // Utility: choose high-contrast (or softened) text style against an arbitrary background color + const pickTextStyleForBackground = (bgCss, opts = {}) => { + const cssRoot = document.documentElement; + const getCssVar = (name) => { + try { return getComputedStyle(cssRoot).getPropertyValue(name).trim(); } catch { return ''; } + }; + const resolveCssToRgb01 = (css) => { + const rgb = parseCssColorToRgb(css); + if (!rgb) return null; + return rgb; // already 0..1 + }; + const mixRgb01 = (a, b, t) => ({ r: a.r * (1 - t) + b.r * t, g: a.g * (1 - t) + b.g * t, b: a.b * (1 - t) + b.b * t }); + const relLum = (rgb) => { + const f = (u) => srgbToLinear(u); + return 0.2126 * f(rgb.r) + 0.7152 * f(rgb.g) + 0.0722 * f(rgb.b); + }; + const contrast = (fg, bg) => { + const L1 = relLum(fg), L2 = relLum(bg); const a = Math.max(L1, L2), b = Math.min(L1, L2); + return (a + 0.05) / (b + 0.05); + }; + try { + const bg = resolveCssToRgb01(bgCss); + if (!bg) return { fill: 
getCssVar('--text-color') || '#000', stroke: 'var(--transparent-page-contrast)', strokeWidth: 1 }; + const candidatesCss = [getCssVar('--text-color') || '#111', getCssVar('--on-primary') || '#0f1115', '#000', '#fff']; + const candidates = candidatesCss + .map(css => ({ css, rgb: resolveCssToRgb01(css) })) + .filter(x => !!x.rgb); + // Pick the max contrast + let best = candidates[0]; let bestCR = contrast(best.rgb, bg); + for (let i = 1; i < candidates.length; i++) { + const cr = contrast(candidates[i].rgb, bg); + if (cr > bestCR) { best = candidates[i]; bestCR = cr; } + } + // Optional softening via blend factor (0..1), blending towards muted color + const blend = Math.min(1, Math.max(0, Number(opts.blend || 0))); + let finalRgb = best.rgb; + if (blend > 0) { + const mutedCss = getCssVar('--muted-color') || (getCssVar('--text-color') || '#111'); + const mutedRgb = resolveCssToRgb01(mutedCss) || best.rgb; + finalRgb = mixRgb01(best.rgb, mutedRgb, blend); + } + const haloStrength = Math.min(1, Math.max(0, Number(opts.haloStrength == null ? 0.5 : opts.haloStrength))); + const stroke = (best.css === '#000' || best.css.toLowerCase() === 'black') ? `rgba(255,255,255,${0.30 + 0.40 * haloStrength})` : `rgba(0,0,0,${0.30 + 0.30 * haloStrength})`; + return { fill: toHex(finalRgb), stroke, strokeWidth: (opts.haloWidth == null ? 
1 : Number(opts.haloWidth)) };
+      } catch {
+        return { fill: getCssVar('--text-color') || '#000', stroke: 'var(--transparent-page-contrast)', strokeWidth: 1 };
+      }
+    };
+    window.ColorPalettes = {
+      refresh: updatePalettes,
+      notify: () => { try { const primaryOKLCH = getPrimaryOKLCH(); const primaryHex = getPrimaryHex(); document.dispatchEvent(new CustomEvent('palettes:updated', { detail: { primary: primaryHex, primaryOKLCH } })); } catch { } },
+      getPrimary: () => getPrimaryHex(),
+      getPrimaryOKLCH: () => getPrimaryOKLCH(),
+      getColors: (key, count = 6) => {
+        const primaryOKLCH = getPrimaryOKLCH();
+        if (!primaryOKLCH) return [];
+        const total = Math.max(1, Math.min(12, Number(count) || 6));
+        if (key === 'categorical') return generators.categorical(primaryOKLCH, total);
+        if (key === 'sequential') return generators.sequential(primaryOKLCH, total);
+        if (key === 'diverging') return generators.diverging(primaryOKLCH, total);
+        return [];
+      },
+      getTextStyleForBackground: (bgCss, opts) => pickTextStyleForBackground(bgCss, opts || {}),
+      chooseReadableText: (bgCss, opts) => pickTextStyleForBackground(bgCss, opts || {})
+    };
+  };
+
+  if (document.readyState === 'loading') document.addEventListener('DOMContentLoaded', bootstrap, { once: true });
+  else bootstrap();
+})();
+
+
diff --git a/app/public/thumb.png b/app/public/thumb.png
new file mode 100644
index 0000000000000000000000000000000000000000..ef3ed3dd0ba629c350151af4dfde3ce8c9222ca6
--- /dev/null
+++ b/app/public/thumb.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ae7bfe85551fa5f70df5341e6c3a5d5d5f0d68553d9a137725fda61d55627ded
+size 279121
diff --git a/app/scripts/EXPORT-PDF-BOOK.md b/app/scripts/EXPORT-PDF-BOOK.md
new file mode 100644
index 0000000000000000000000000000000000000000..25ed30daffd561c93dff4f8baf570f0e97eeb661
--- /dev/null
+++ b/app/scripts/EXPORT-PDF-BOOK.md
@@ -0,0 +1,311 @@
+# 📚 PDF Book Export with Paged.js
+
+A professional PDF generation system with book-style layout, powered by **Paged.js**.
+
+## ✨ Features
+
+### Professional layout
+- ✅ **Automatic pagination** with Paged.js
+- ✅ **Running headers**: chapter titles at the top of the page
+- ✅ **Page numbering**: alternating left/right
+- ✅ **Asymmetric margins**: optimized for binding (recto/verso)
+- ✅ **Widow and orphan control**: avoids isolated lines
+- ✅ **Professional typography**: justification, automatic hyphenation
+
+### Book elements
+- 📖 Automatic counters: chapters, figures, tables
+- 📑 Footnotes (if implemented)
+- 🔢 Hierarchical numbering (1.2.3, etc.)
+- 📊 Full support for D3/Plotly visualizations
+- 🖼️ Figures with numbered captions
+- 📝 Citations and references
+
+## 🚀 Usage
+
+### Basic command
+
+```bash
+npm run export:pdf:book
+```
+
+This command will:
+1. Build the Astro site (if needed)
+2. Start a preview server
+3. Load the page and inject Paged.js
+4. Paginate the content automatically
+5. 
Generate the PDF at `dist/article-book.pdf`
+
+### Available options
+
+```bash
+# Dark theme
+npm run export:pdf:book -- --theme=dark
+
+# Custom page format
+npm run export:pdf:book -- --format=Letter
+
+# Custom file name
+npm run export:pdf:book -- --filename=mon-livre
+
+# Combining options
+npm run export:pdf:book -- --theme=light --format=A4 --filename=thesis
+```
+
+#### Option details
+
+| Option | Values | Default | Description |
+|--------|--------|---------|-------------|
+| `--theme` | `light`, `dark` | `light` | Color theme |
+| `--format` | `A4`, `Letter`, `Legal`, `A3`, `Tabloid` | `A4` | Page format |
+| `--filename` | `string` | `article-book` | Output file name |
+| `--wait` | `full`, `images`, `plotly`, `d3` | `full` | Wait strategy |
+
+## 📐 Page format
+
+The system uses margins optimized for book printing:
+
+### Right-hand pages (recto)
+- Left margin: **25mm** (binding)
+- Right margin: **20mm**
+- Right header: section title
+- Right footer: page number
+
+### Left-hand pages (verso)
+- Left margin: **20mm**
+- Right margin: **25mm** (binding)
+- Left header: chapter title
+- Left footer: page number
+
+### First page
+- Larger margins (40mm top/bottom)
+- No headers/footers
+- Centered
+
+## 🎨 CSS customization
+
+The book style is defined in:
+```
+app/src/styles/_print-book.css
+```
+
+### Changing the margins
+
+```css
+@page {
+  margin-top: 20mm;
+  margin-bottom: 25mm;
+  /* ... */
+}
+
+@page :left {
+  margin-left: 20mm;
+  margin-right: 25mm;
+}
+
+@page :right {
+  margin-left: 25mm;
+  margin-right: 20mm;
+}
+```
+
+### Changing the typography
+
+```css
+body {
+  font-family: "Georgia", "Palatino", "Times New Roman", serif;
+  font-size: 11pt;
+  line-height: 1.6;
+}
+
+h2 {
+  font-size: 18pt;
+  /* ... 
*/
+}
+```
+
+### Customizing the running headers
+
+```css
+@page :left {
+  @top-left {
+    content: string(chapter-title);
+    font-size: 9pt;
+    font-style: italic;
+    /* ... */
+  }
+}
+
+@page :right {
+  @top-right {
+    content: string(section-title);
+    /* ... */
+  }
+}
+```
+
+### Adding a logo/watermark
+
+```css
+@page {
+  background-image: url('/logo.png');
+  background-position: bottom center;
+  background-size: 20mm;
+  background-repeat: no-repeat;
+}
+```
+
+## 🔧 Advanced Paged.js configuration
+
+### Custom JavaScript hooks
+
+You can add Paged.js hooks in the `export-pdf-book.mjs` script:
+
+```javascript
+// After Paged.js has been injected
+await page.evaluate(() => {
+  class BookHooks extends window.Paged.Handler {
+    beforeParsed(content) {
+      // Modify the content before pagination
+    }
+
+    afterParsed(parsed) {
+      // After parsing
+    }
+
+    afterRendered(pages) {
+      // After all pages have been rendered
+      console.log(`Rendered ${pages.length} pages`);
+    }
+  }
+
+  window.Paged.registerHandlers(BookHooks);
+});
+```
+
+### Forcing page breaks
+
+In your MDX:
+
+```mdx
+## Chapter 1
+
+Content...
+
+
+
+## Chapter 2 (starts on a new page)
+```
+
+Or with a CSS class:
+
+```css
+.chapter-break {
+  break-after: page;
+}
+```
+
+## 📊 Visualizations
+
+D3 and Plotly charts are automatically:
+- ✅ Resized for the book format
+- ✅ Rendered in high quality
+- ✅ Kept from being split across pages
+- ✅ Left interactive in the source HTML
+
+## 🐛 Troubleshooting
+
+### The PDF is empty or incomplete
+
+```bash
+# Increase the wait time
+npm run export:pdf:book -- --wait=full
+```
+
+### Images are not displayed
+
+Check that image paths are **absolute** in the HTML:
+```html
+
+
+
+
+
+```
+
+### Charts are cut off
+
+Add to `_print-book.css`:
+```css
+.your-chart-class {
+  max-height: 180mm !important;
+  break-inside: avoid;
+}
+```
+
+### "Paged.js not found" error
+
+```bash
+# Reinstall Paged.js
+cd app
+npm install pagedjs
+```
+
+### The server does not start
+
+```bash
+# Port already in use? Change the port
+PREVIEW_PORT=8081 npm run export:pdf:book
+```
+
+## 📚 Paged.js resources
+
+- **Official documentation**: https://pagedjs.org/documentation/
+- **CSS Paged Media specification**: https://www.w3.org/TR/css-page-3/
+- **Examples**: https://pagedjs.org/examples/
+
+## 🆚 Differences from the standard export:pdf
+
+| Feature | `export:pdf` | `export:pdf:book` |
+|---------|--------------|-------------------|
+| Pagination | Standard browser | Professional (Paged.js) |
+| Running headers | ❌ | ✅ |
+| Binding margins | ❌ | ✅ |
+| Advanced numbering | ❌ | ✅ |
+| Automatic counters | ❌ | ✅ |
+| Widow/orphan control | Basic | Advanced |
+| Footnotes | ❌ | ✅ (if enabled) |
+| Typographic control | Standard | Professional |
+| Table of contents | Manual | Automatic (with CSS) |
+
+## 💡 Tips for best results
+
+1. **Structure your content** with `<h2>` for chapters
+2. **Use `<h3>` for sections** (they appear in the headers)
+3. **Add IDs** to headings for cross-references
+4. **Optimize images**: 300 DPI resolution for printing
+5. **Test the rendering** before the final print run
+6. **Avoid bright colors** in print mode (prefer grayscale)
+
+## 🎯 Use cases
+
+This system is well suited to:
+- 📘 **Theses and dissertations**
+- 📗 **Technical books**
+- 📕 **Academic reports**
+- 📙 **Long-form documentation**
+- 📓 **Premium e-books**
+- 📔 **Scientific journals**
+
+## 🔮 Future improvements
+
+- [ ] Automatic table of contents generation
+- [ ] Index support
+- [ ] Automatic cross-references
+- [ ] EPUB export
+- [ ] Preconfigured book templates
+- [ ] "Two-up" mode for double-page preview
+
+---
+
+**Made with ❤️ by your template team**
+
diff --git a/app/scripts/README-PDF-BOOK.md b/app/scripts/README-PDF-BOOK.md
new file mode 100644
index 0000000000000000000000000000000000000000..5f106d9c91a2a8df6b26deb1aa3ca92ad8af25b9
--- /dev/null
+++ b/app/scripts/README-PDF-BOOK.md
@@ -0,0 +1,309 @@
+# 📚 PDF Book Export - Complete Guide
+
+A professional PDF generation system with book-style layout for your scientific article template.
+
+## 🎯 Goal
+
+Produce professional-quality PDFs with:
+- Careful typography (Georgia, justification, hyphenation)
+- Asymmetric margins for binding
+- Running headers with chapter titles
+- Left/right page numbering
+- Widow and orphan control
+- An academic/editorial book style
+
+## 📦 What was created
+
+### Files created
+
+```
+app/
+├── scripts/
+│   ├── export-pdf-book.mjs          ← Paged.js version (advanced, in progress)
+│   ├── export-pdf-book-simple.mjs   ← Simple version (RECOMMENDED ✅)
+│   └── EXPORT-PDF-BOOK.md           ← Detailed documentation
+└── src/
+    └── styles/
+        └── _print-book.css          ← CSS Paged Media styles
+```
+
+### npm commands added
+
+```json
+{
+  "export:pdf:book": "Paged.js version (experimental)",
+  "export:pdf:book:simple": "Simple version (stable ✅)"
+}
+```
+
+## 🚀 Usage
+
+### Recommended command
+
+```bash
+npm run export:pdf:book:simple
+```
+
+The PDF is generated at:
+- `dist/article-book.pdf`
+- `public/article-book.pdf` (automatic copy)
+
+### Available options
+
+```bash
+# Dark theme
+npm run export:pdf:book:simple -- --theme=dark
+
+# Letter format
+npm run export:pdf:book:simple -- --format=Letter
+
+# Custom name
+npm run export:pdf:book:simple -- --filename=ma-these
+
+# Combination
+npm run export:pdf:book:simple -- --theme=light --format=A4 --filename=livre
+```
+
+## 🎨 Book style characteristics
+
+### Margins
+
+```
+Right pages (recto)     │  Left pages (verso)
+                        │
+   20mm ──┐             │            ┌── 25mm
+          │             │            │
+  ┌───────┴──────┐      │     ┌──────┴───────┐
+  │              │      │     │              │
+  │   CONTENT    │      │     │   CONTENT    │
+  │              │      │     │              │
+  └──────────────┘      │     └──────────────┘
+        25mm            │           20mm
+     (binding)          │        (binding)
+```
+
+### Typography
+
+- **Font**: Georgia, Palatino (serif)
+- **Size**: 11pt
+- **Line height**: 1.6
+- **Alignment**: Justified with automatic hyphenation
+- **Indent**: 5mm for following paragraphs
+
+### Headings
+
+```css
+H2 (Chapters) → 18pt, numbered (1. 2. 3.)
+H3 (Sections) → 14pt, numbered (1.1, 1.2)
+H4 (Subsections) → 12pt
+```
+
+### Automatic counters
+
+- Chapters: 1, 2, 3...
+- Sections: 1.1, 1.2, 2.1...
+- Figures: Figure 1.1, Figure 1.2...
+- Tables: same scheme
+
+## 📐 CSS configuration
+
+The `_print-book.css` file contains all the styles. You can customize:
+
+### Changing the fonts
+
+```css
+body {
+  font-family: "Baskerville", "Georgia", serif;
+  font-size: 12pt;
+}
+```
+
+### Adjusting the margins
+
+```css
+@page {
+  margin-top: 25mm;
+  margin-bottom: 30mm;
+}
+
+@page :left {
+  margin-left: 18mm;
+  margin-right: 30mm;
+}
+```
+
+### Customizing the headers
+
+```css
+@page :left {
+  @top-left {
+    content: string(chapter-title);
+    font-size: 10pt;
+    color: #333;
+  }
+}
+```
+
+### Forcing a page break
+
+In your MDX:
+```mdx
+## Chapter 1
+
+Content...
+
+---
+
+## Chapter 2 (new page)
+```
+
+Or with CSS:
+```css
+.new-chapter {
+  break-before: page;
+}
+```
+
+## 🆚 Version comparison
+
+| Feature | Simple | Paged.js |
+|---------|--------|----------|
+| **Stability** | ✅ Excellent | ⚠️ In progress |
+| **Speed** | ✅ Fast | ⏱️ Slower |
+| **Setup** | ✅ None | 📦 Paged.js required |
+| **Binding margins** | ✅ | ✅ |
+| **Running headers** | ⚠️ Limited | ✅ Advanced |
+| **Footnotes** | ❌ | ✅ |
+| **Automatic table of contents** | ❌ | ✅ |
+| **Typographic quality** | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
+
+### Which version should you use?
+
+**Simple version** (recommended):
+- ✅ For most use cases
+- ✅ When stability is the priority
+- ✅ Fast generation
+- ✅ Predictable results
+
+**Paged.js version** (experimental):
+- 🔬 To try out the advanced features
+- 📚 If you need footnotes
+- 📖 For automatically generated tables of contents
+- ⚠️ Needs more testing
+
+## 🐛 Troubleshooting
+
+### The PDF is empty
+
+```bash
+# Rebuild first
+npm run build
+npm run export:pdf:book:simple
+```
+
+### Images are missing
+
+Check that the paths are absolute:
+```html
+
+
+
+
+
+```
+
+### Charts are cut off
+
+In `_print-book.css`, add:
+```css
+.your-chart {
+  max-height: 200mm;
+  break-inside: avoid;
+}
+```
+
+### Port 8080 already in use
+
+```bash
+PREVIEW_PORT=8081 npm run export:pdf:book:simple
+```
+
+## 🎓 Next steps
+
+### Possible improvements
+
+1. **Finalize Paged.js** for the advanced features
+2. **Automatic table of contents** with page numbers
+3. **Index** generated automatically
+4. **Cross-references** (See Figure 2.3, etc.)
+5. **Predefined templates**:
+   - Academic thesis
+   - Technical report
+   - Scientific book
+   - Documentation
+
+### Contributing
+
+The styles live in `_print-book.css`. To propose improvements:
+
+1. Test with your own content
+2. Edit the CSS
+3. Generate the PDF
+4. Share your changes!
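When repeating the contribution steps above with different flags, it can help to see how the documented `--theme`, `--format` and `--filename` options might resolve to an output path. The sketch below is only an illustration, not the export script itself: `resolveBookOptions` and `DEFAULTS` are hypothetical names, and the default values are assumptions inferred from the `dist/article-book.pdf` path and the `--theme=light --format=A4` example in this guide. `parseArgs` follows the same `--key=value` convention as the scripts in this change.

```javascript
// Hypothetical defaults, inferred from this guide rather than read from the script.
const DEFAULTS = { theme: 'light', format: 'A4', filename: 'article-book' };

// Parse `--key=value` flags; a bare `--key` becomes `true`.
function parseArgs(argv) {
  const out = {};
  for (const arg of argv.slice(2)) {
    if (!arg.startsWith('--')) continue;
    const [k, v] = arg.replace(/^--/, '').split('=');
    out[k] = v === undefined ? true : v;
  }
  return out;
}

// Hypothetical helper: merge CLI flags with the assumed defaults.
function resolveBookOptions(argv) {
  const args = parseArgs(argv);
  const theme = args.theme === 'dark' ? 'dark' : DEFAULTS.theme;
  const format = typeof args.format === 'string' ? args.format : DEFAULTS.format;
  const base = typeof args.filename === 'string'
    ? args.filename.replace(/\.pdf$/i, '') // tolerate a trailing .pdf
    : DEFAULTS.filename;
  return { theme, format, output: `dist/${base}.pdf` };
}

console.log(resolveBookOptions(['node', 'export', '--theme=dark', '--filename=thesis']));
```

Keeping the flag handling in a small pure function like this makes it easy to check a combination of options without launching a browser or a preview server.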
+
+## 📚 Resources
+
+### CSS Paged Media
+
+- [W3C Spec](https://www.w3.org/TR/css-page-3/)
+- [CSS Tricks Guide](https://css-tricks.com/css-paged-media-guide/)
+- [Print CSS Documentation](https://www.smashingmagazine.com/2015/01/designing-for-print-with-css/)
+
+### Paged.js
+
+- [Documentation](https://pagedjs.org/documentation/)
+- [Examples](https://pagedjs.org/examples/)
+- [W3C Paged Media](https://www.w3.org/TR/css-page-3/)
+
+### Book typography
+
+- [Butterick's Practical Typography](https://practicaltypography.com/)
+- [The Elements of Typographic Style](http://webtypography.net/)
+
+## 💡 Use cases
+
+This system is well suited to:
+
+- 📘 **Doctoral theses**
+- 📗 **Master's dissertations**
+- 📕 **Research reports**
+- 📙 **Technical documentation**
+- 📓 **White papers**
+- 📔 **Self-published books**
+- 📚 **Article collections**
+
+## 🎉 Result
+
+With this system you get:
+
+✅ **A print-ready PDF**
+- Correct margins for binding
+- Professional typography
+- Consistent layout
+
+✅ **Editorial quality**
+- Automatic numbering
+- Widow/orphan control
+- Clean hyphenation
+
+✅ **A modern workflow**
+- Writing in MDX
+- Automated build
+- A single source file
+
+---
+
+**Made with ❤️ for the Research Article Template**
+
+*Enjoy your new PDF book export system!* 📚✨
+
diff --git a/app/scripts/export-latex.mjs b/app/scripts/export-latex.mjs
new file mode 100755
index 0000000000000000000000000000000000000000..d4efc9d206db22dcb9486afc2a50a9edd0bc4d37
--- /dev/null
+++ b/app/scripts/export-latex.mjs
@@ -0,0 +1,358 @@
+#!/usr/bin/env node
+import { spawn } from 'node:child_process';
+import { promises as fs } from 'node:fs';
+import { resolve, dirname, basename, extname } from 'node:path';
+import process from 'node:process';
+
+async function run(command, args = [], options = {}) {
+  return new Promise((resolvePromise, reject) => {
+    const child = spawn(command, args, { stdio: 'inherit', shell: 
false, ...options }); + child.on('error', reject); + child.on('exit', (code) => { + if (code === 0) resolvePromise(undefined); + else reject(new Error(`${command} ${args.join(' ')} exited with code ${code}`)); + }); + }); +} + +function parseArgs(argv) { + const out = {}; + for (const arg of argv.slice(2)) { + if (!arg.startsWith('--')) continue; + const [k, v] = arg.replace(/^--/, '').split('='); + out[k] = v === undefined ? true : v; + } + return out; +} + +function slugify(text) { + return String(text || '') + .normalize('NFKD') + .replace(/\p{Diacritic}+/gu, '') + .toLowerCase() + .replace(/[^a-z0-9]+/g, '-') + .replace(/^-+|-+$/g, '') + .slice(0, 120) || 'article'; +} + +async function checkPandocInstalled() { + try { + await run('pandoc', ['--version'], { stdio: 'pipe' }); + return true; + } catch { + return false; + } +} + +async function readMdxFile(filePath) { + try { + const content = await fs.readFile(filePath, 'utf-8'); + return content; + } catch (error) { + console.warn(`Warning: Could not read ${filePath}:`, error.message); + return ''; + } +} + +function extractFrontmatter(content) { + const frontmatterMatch = content.match(/^---\n([\s\S]*?)\n---\n/); + if (!frontmatterMatch) return { frontmatter: {}, content }; + + const frontmatterText = frontmatterMatch[1]; + const contentWithoutFrontmatter = content.replace(frontmatterMatch[0], ''); + + // More robust YAML parsing that handles complex structures + const frontmatter = {}; + const lines = frontmatterText.split('\n'); + let currentKey = null; + let currentValue = ''; + let inMultiLineValue = false; + let multiLineOperator = null; // '>' or '|' + + for (const line of lines) { + // Check if this is a new key + if (line.match(/^[a-zA-Z_][a-zA-Z0-9_]*\s*:/) && !inMultiLineValue) { + // Save previous key if exists + if (currentKey) { + frontmatter[currentKey] = currentValue.trim(); + } + + const [key, ...valueParts] = line.split(':'); + currentKey = key.trim(); + currentValue = 
valueParts.join(':').trim();
+
+      // Check for multi-line operators
+      if (currentValue.endsWith('>') || currentValue.endsWith('|')) {
+        multiLineOperator = currentValue.slice(-1);
+        currentValue = currentValue.slice(0, -1).trim();
+        inMultiLineValue = true;
+      } else if (currentValue) {
+        inMultiLineValue = false;
+      } else {
+        inMultiLineValue = true;
+      }
+    } else if (currentKey && (inMultiLineValue || line.match(/^\s/))) {
+      // Continuation line or nested content
+      if (inMultiLineValue) {
+        if (line.trim() === '' && multiLineOperator === '>') {
+          // Empty line in folded style should become space
+          currentValue += ' ';
+        } else {
+          const lineContent = line.startsWith(' ') ? line : ' ' + line;
+          currentValue += lineContent;
+        }
+      } else {
+        currentValue += '\n' + line;
+      }
+    }
+  }
+
+  // Save the last key
+  if (currentKey) {
+    frontmatter[currentKey] = currentValue.trim();
+  }
+
+  return { frontmatter, content: contentWithoutFrontmatter };
+}
+
+function cleanMdxToMarkdown(content) {
+  // Remove import statements
+  content = content.replace(/^import .+?;?\s*$/gm, '');
+
+  // Remove self-closing JSX component calls (capitalized tag names)
+  content = content.replace(/<[A-Z][a-zA-Z0-9]*\s*\/>/g, '');
+
+  // Convert JSX components to simpler markdown
+  // Handle Sidenote components specially
+  content = content.replace(/<Sidenote>([\s\S]*?)<\/Sidenote>/g, (match, innerContent) => {
+    // Extract main content and aside content
+    const asideMatch = innerContent.match(/<Fragment slot="aside">([\s\S]*?)<\/Fragment>/);
+    const mainContent = innerContent.replace(/<Fragment slot="aside">[\s\S]*?<\/Fragment>/, '').trim();
+    const asideContent = asideMatch ? 
asideMatch[1].trim() : '';
+
+    let result = mainContent;
+    if (asideContent) {
+      result += `\n\n> **Note:** ${asideContent}`;
+    }
+    return result;
+  });
+
+  // Handle Note components
+  content = content.replace(/<Note[^>]*>([\s\S]*?)<\/Note>/g, (match, innerContent) => {
+    return `\n> **Note:** ${innerContent.trim()}\n`;
+  });
+
+  // Handle Wide and FullWidth components
+  content = content.replace(/<(Wide|FullWidth)>([\s\S]*?)<\/\1>/g, '$2');
+
+  // Handle HtmlEmbed components (convert to simple text)
+  content = content.replace(/<HtmlEmbed[^>]*\/>/g, '*[Interactive content not available in LaTeX]*');
+
+  // Remove remaining JSX fragments
+  content = content.replace(/<Fragment[^>]*>([\s\S]*?)<\/Fragment>/g, '$1');
+  content = content.replace(/<[A-Z][a-zA-Z0-9]*[^>]*>([\s\S]*?)<\/[A-Z][a-zA-Z0-9]*>/g, '$1');
+
+  // Clean up className attributes
+  content = content.replace(/className="[^"]*"/g, '');
+
+  // Clean up extra whitespace
+  content = content.replace(/\n{3,}/g, '\n\n');
+
+  // Clean up characters that might cause YAML parsing issues
+  // Remove any potential YAML-style markers that might interfere
+  content = content.replace(/^---$/gm, '');
+  content = content.replace(/^\s*&\s+/gm, ''); // Remove YAML aliases
+
+  return content.trim();
+}
+
+async function processChapterImports(content, contentDir) {
+  let processedContent = content;
+
+  // First, extract all import statements and their corresponding component calls
+  const importPattern = /import\s+(\w+)\s+from\s+["']\.\/chapters\/([^"']+)["'];?/g;
+  const imports = new Map();
+  let match;
+
+  // Collect all imports
+  while ((match = importPattern.exec(content)) !== null) {
+    const [fullImport, componentName, chapterPath] = match;
+    imports.set(componentName, { path: chapterPath, importStatement: fullImport });
+  }
+
+  // Remove all import statements
+  processedContent = processedContent.replace(importPattern, '');
+
+  // Process each component call
+  for (const [componentName, { path: chapterPath }] of imports) {
+    const 
componentCallPattern = new RegExp(`<${componentName}\\s*\\/>`, 'g'); + + try { + const chapterFile = resolve(contentDir, 'chapters', chapterPath); + const chapterContent = await readMdxFile(chapterFile); + const { content: chapterMarkdown } = extractFrontmatter(chapterContent); + const cleanChapter = cleanMdxToMarkdown(chapterMarkdown); + + processedContent = processedContent.replace(componentCallPattern, cleanChapter); + console.log(`✅ Processed chapter: ${chapterPath}`); + } catch (error) { + console.warn(`Warning: Could not process chapter ${chapterPath}:`, error.message); + processedContent = processedContent.replace(componentCallPattern, `\n*[Chapter ${chapterPath} could not be loaded]*\n`); + } + } + + return processedContent; +} + +function createLatexPreamble(frontmatter) { + const title = frontmatter.title ? frontmatter.title.replace(/\n/g, ' ') : 'Untitled Article'; + const subtitle = frontmatter.subtitle || ''; + const authors = frontmatter.authors || ''; + const date = frontmatter.published || ''; + + return `\\documentclass[11pt,a4paper]{article} +\\usepackage[utf8]{inputenc} +\\usepackage[T1]{fontenc} +\\usepackage{amsmath,amsfonts,amssymb} +\\usepackage{graphicx} +\\usepackage{hyperref} +\\usepackage{booktabs} +\\usepackage{longtable} +\\usepackage{array} +\\usepackage{multirow} +\\usepackage{wrapfig} +\\usepackage{float} +\\usepackage{colortbl} +\\usepackage{pdflscape} +\\usepackage{tabu} +\\usepackage{threeparttable} +\\usepackage{threeparttablex} +\\usepackage{ulem} +\\usepackage{makecell} +\\usepackage{xcolor} +\\usepackage{listings} +\\usepackage{fancyvrb} +\\usepackage{geometry} +\\geometry{margin=1in} + +\\title{${title}${subtitle ? `\\\\\\large ${subtitle}` : ''}} +${authors ? `\\author{${authors}}` : ''} +${date ? 
`\\date{${date}}` : ''} + +\\begin{document} +\\maketitle +\\tableofcontents +\\newpage + +`; +} + +async function main() { + const cwd = process.cwd(); + const args = parseArgs(process.argv); + + // Check if pandoc is installed + const hasPandoc = await checkPandocInstalled(); + if (!hasPandoc) { + console.error('❌ Pandoc is not installed. Please install it first:'); + console.error(' macOS: brew install pandoc'); + console.error(' Ubuntu: apt-get install pandoc'); + console.error(' Windows: choco install pandoc'); + process.exit(1); + } + + const contentDir = resolve(cwd, 'src/content'); + const articleFile = resolve(contentDir, 'article.mdx'); + + // Check if article.mdx exists + try { + await fs.access(articleFile); + } catch { + console.error(`❌ Could not find article.mdx at ${articleFile}`); + process.exit(1); + } + + console.log('> Reading article content...'); + const articleContent = await readMdxFile(articleFile); + const { frontmatter, content } = extractFrontmatter(articleContent); + + console.log('> Processing chapters...'); + const processedContent = await processChapterImports(content, contentDir); + + console.log('> Converting MDX to Markdown...'); + const markdownContent = cleanMdxToMarkdown(processedContent); + + // Generate output filename + const title = frontmatter.title ? frontmatter.title.replace(/\n/g, ' ') : 'article'; + const outFileBase = args.filename ? 
String(args.filename).replace(/\.(tex|pdf)$/i, '') : slugify(title); + + // Create temporary markdown file (ensure it's pure markdown without YAML frontmatter) + const tempMdFile = resolve(cwd, 'temp-article.md'); + + // Clean the markdown content to ensure no YAML frontmatter remains + let cleanMarkdown = markdownContent; + // Remove any potential YAML frontmatter that might have leaked through + cleanMarkdown = cleanMarkdown.replace(/^---\n[\s\S]*?\n---\n/, ''); + // Remove any standalone YAML blocks that might cause issues + cleanMarkdown = cleanMarkdown.replace(/^---\n([\s\S]*?)\n---$/gm, ''); + + await fs.writeFile(tempMdFile, cleanMarkdown); + + + console.log('> Converting to LaTeX with Pandoc...'); + const outputLatex = resolve(cwd, 'dist', `${outFileBase}.tex`); + + // Ensure dist directory exists + await fs.mkdir(resolve(cwd, 'dist'), { recursive: true }); + + // Pandoc conversion arguments + const pandocArgs = [ + tempMdFile, + '-o', outputLatex, + '--from=markdown-yaml_metadata_block', // Explicitly exclude YAML metadata parsing + '--to=latex', + '--standalone', + '--toc', + '--number-sections', + '--highlight-style=tango', + '--listings' + ]; + + // Add bibliography if it exists + const bibFile = resolve(contentDir, 'bibliography.bib'); + try { + await fs.access(bibFile); + pandocArgs.push('--bibliography', bibFile); + pandocArgs.push('--citeproc'); + console.log('✅ Found bibliography file, including citations'); + } catch { + console.log('ℹ️ No bibliography file found'); + } + + try { + await run('pandoc', pandocArgs); + console.log(`✅ LaTeX generated: ${outputLatex}`); + + // Optionally compile to PDF if requested + if (args.pdf) { + console.log('> Compiling LaTeX to PDF...'); + const outputPdf = resolve(cwd, 'dist', `${outFileBase}.pdf`); + await run('pdflatex', ['-output-directory', resolve(cwd, 'dist'), outputLatex]); + console.log(`✅ PDF generated: ${outputPdf}`); + } + + } catch (error) { + console.error('❌ Pandoc conversion failed:', 
error.message);
+    // Set the exit code instead of calling process.exit() here,
+    // so the finally block below still removes the temporary file
+    process.exitCode = 1;
+  } finally {
+    // Clean up temporary file
+    try {
+      await fs.unlink(tempMdFile);
+    } catch { }
+  }
+}
+
+main().catch((err) => {
+  console.error(err);
+  process.exit(1);
+});
diff --git a/app/scripts/export-pdf-book-simple.mjs b/app/scripts/export-pdf-book-simple.mjs
new file mode 100755
index 0000000000000000000000000000000000000000..fe52da66a94ddb09b93cb18ef32cce4003b78576
--- /dev/null
+++ b/app/scripts/export-pdf-book-simple.mjs
@@ -0,0 +1,416 @@
+#!/usr/bin/env node
+/**
+ * Export PDF Book - Simplified Version
+ *
+ * Generates a professional-quality PDF with book-style layout
+ * directly with Playwright + CSS Paged Media (without Paged.js, for better stability)
+ *
+ * Usage:
+ *   npm run export:pdf:book:simple
+ *   npm run export:pdf:book:simple -- --theme=dark --format=A4
+ */
+
+import { spawn } from 'node:child_process';
+import { setTimeout as delay } from 'node:timers/promises';
+import { chromium } from 'playwright';
+import { resolve, dirname } from 'node:path';
+import { promises as fs } from 'node:fs';
+import { fileURLToPath } from 'node:url';
+import process from 'node:process';
+
+const __dirname = dirname(fileURLToPath(import.meta.url));
+
+// ============================================================================
+// Utilities (reused from the original script)
+// ============================================================================
+
+async function run(command, args = [], options = {}) {
+  return new Promise((resolvePromise, reject) => {
+    const child = spawn(command, args, { stdio: 'inherit', shell: false, ...options });
+    child.on('error', reject);
+    child.on('exit', (code) => {
+      if (code === 0) resolvePromise(undefined);
+      else reject(new Error(`${command} ${args.join(' ')} exited with code ${code}`));
+    });
+  });
+}
+
+async function waitForServer(url, timeoutMs = 60000) {
+  const start = Date.now();
+  while (Date.now() - start < timeoutMs) {
+    try {
+      const res = await 
fetch(url); + if (res.ok) return; + } catch { } + await delay(500); + } + throw new Error(`Server did not start in time: ${url}`); +} + +function parseArgs(argv) { + const out = {}; + for (const arg of argv.slice(2)) { + if (!arg.startsWith('--')) continue; + const [k, v] = arg.replace(/^--/, '').split('='); + out[k] = v === undefined ? true : v; + } + return out; +} + +function slugify(text) { + return String(text || '') + .normalize('NFKD') + .replace(/\p{Diacritic}+/gu, '') + .toLowerCase() + .replace(/[^a-z0-9]+/g, '-') + .replace(/^-+|-+$/g, '') + .slice(0, 120) || 'article'; +} + +async function waitForImages(page, timeoutMs = 15000) { + await page.evaluate(async (timeout) => { + const deadline = Date.now() + timeout; + const imgs = Array.from(document.images || []); + const unloaded = imgs.filter(img => !img.complete || (img.naturalWidth === 0)); + await Promise.race([ + Promise.all(unloaded.map(img => new Promise(res => { + if (img.complete && img.naturalWidth !== 0) return res(undefined); + img.addEventListener('load', () => res(undefined), { once: true }); + img.addEventListener('error', () => res(undefined), { once: true }); + }))), + new Promise(res => setTimeout(res, Math.max(0, deadline - Date.now()))) + ]); + }, timeoutMs); +} + +async function waitForPlotly(page, timeoutMs = 20000) { + await page.evaluate(async (timeout) => { + const start = Date.now(); + const hasPlots = () => Array.from(document.querySelectorAll('.js-plotly-plot')).length > 0; + while (!hasPlots() && (Date.now() - start) < timeout) { + await new Promise(r => setTimeout(r, 200)); + } + const deadline = start + timeout; + const allReady = () => Array.from(document.querySelectorAll('.js-plotly-plot')).every(el => el.querySelector('svg.main-svg')); + while (!allReady() && Date.now() < deadline) { + await new Promise(r => setTimeout(r, 200)); + } + }, timeoutMs); +} + +async function waitForD3(page, timeoutMs = 20000) { + await page.evaluate(async (timeout) => { + const start = 
Date.now(); + const isReady = () => { + const hero = document.querySelector('.hero-banner'); + if (hero) { + return !!hero.querySelector('svg circle, svg path, svg rect, svg g'); + } + const containers = [ + ...Array.from(document.querySelectorAll('.d3-line')), + ...Array.from(document.querySelectorAll('.d3-bar')) + ]; + if (!containers.length) return true; + return containers.every(c => c.querySelector('svg circle, svg path, svg rect, svg g')); + }; + while (!isReady() && (Date.now() - start) < timeout) { + await new Promise(r => setTimeout(r, 200)); + } + }, timeoutMs); +} + +async function waitForStableLayout(page, timeoutMs = 5000) { + const start = Date.now(); + let last = await page.evaluate(() => document.scrollingElement ? document.scrollingElement.scrollHeight : document.body.scrollHeight); + let stableCount = 0; + while ((Date.now() - start) < timeoutMs && stableCount < 3) { + await page.waitForTimeout(250); + const now = await page.evaluate(() => document.scrollingElement ? document.scrollingElement.scrollHeight : document.body.scrollHeight); + if (now === last) stableCount += 1; else { stableCount = 0; last = now; } + } +} + +async function openAllAccordions(page) { + console.log('📂 Opening all accordions…'); + await page.evaluate(() => { + // Find all accordions (details.accordion) + const accordions = document.querySelectorAll('details.accordion, details'); + let openedCount = 0; + + accordions.forEach((accordion) => { + if (!accordion.hasAttribute('open')) { + // Open the accordion by adding the open attribute + accordion.setAttribute('open', ''); + + // Force the content to be displayed for the PDF + const wrapper = accordion.querySelector('.accordion__content-wrapper'); + if (wrapper) { + wrapper.style.height = 'auto'; + wrapper.style.overflow = 'visible'; + } + + openedCount++; + } + }); + + console.log(`Opened ${openedCount} accordion(s)`); + return openedCount; + }); + + // Short delay to let the accordions settle + await 
page.waitForTimeout(500); +} + +async function waitForHtmlEmbeds(page, timeoutMs = 15000) { + console.log('⏳ Waiting for HTML embeds to render…'); + await page.evaluate(async (timeout) => { + const start = Date.now(); + + const isEmbedReady = (embed) => { + try { + // Check whether the embed has content + const hasContent = embed.querySelector('svg, canvas, div[id^="frag-"]'); + if (!hasContent) return false; + + // Check whether the SVGs contain elements + const svgs = embed.querySelectorAll('svg'); + for (const svg of svgs) { + const hasShapes = svg.querySelector('path, circle, rect, line, polygon, g'); + if (!hasShapes) return false; + } + + // Check whether the canvases have been drawn + const canvases = embed.querySelectorAll('canvas'); + for (const canvas of canvases) { + try { + const ctx = canvas.getContext('2d'); + const imageData = ctx.getImageData(0, 0, Math.min(10, canvas.width), Math.min(10, canvas.height)); + // Check whether at least one pixel is non-transparent + const hasPixels = Array.from(imageData.data).some((v, i) => i % 4 === 3 && v > 0); + if (!hasPixels) return false; + } catch (e) { + // Cross-origin or other error: treat the canvas as ready + } + } + + return true; + } catch (e) { + return false; + } + }; + + while (Date.now() - start < timeout) { + const embeds = Array.from(document.querySelectorAll('.html-embed__card')); + if (embeds.length === 0) break; // No embeds on the page + + const allReady = embeds.every(isEmbedReady); + if (allReady) { + console.log(`All ${embeds.length} HTML embeds ready`); + break; + } + + await new Promise(r => setTimeout(r, 300)); + } + }, timeoutMs); +} + +// ============================================================================ +// Main script +// ============================================================================ + +async function main() { + const cwd = process.cwd(); + const port = Number(process.env.PREVIEW_PORT || 8080); + const baseUrl = `http://127.0.0.1:${port}/`; + const args = 
parseArgs(process.argv); + + const theme = (args.theme === 'dark' || args.theme === 'light') ? args.theme : 'light'; + const format = args.format || 'A4'; + const wait = args.wait || 'full'; + + let outFileBase = (args.filename && String(args.filename).replace(/\.pdf$/i, '')) || 'article-book'; + + // Build if needed + const distDir = resolve(cwd, 'dist'); + let hasDist = false; + try { + const st = await fs.stat(distDir); + hasDist = st && st.isDirectory(); + } catch { } + + if (!hasDist) { + console.log('📦 Building Astro site…'); + await run('npm', ['run', 'build']); + } else { + console.log('✓ Using existing dist/ build'); + } + + console.log('🚀 Starting Astro preview server…'); + const preview = spawn('npm', ['run', 'preview'], { cwd, stdio: 'inherit', detached: true }); + const previewExit = new Promise((resolvePreview) => { + preview.on('close', (code, signal) => resolvePreview({ code, signal })); + }); + + try { + await waitForServer(baseUrl, 60000); + console.log('✓ Server ready'); + + console.log('📖 Launching browser…'); + const browser = await chromium.launch({ headless: true }); + + try { + const context = await browser.newContext(); + + // Apply the theme + await context.addInitScript((desired) => { + try { + localStorage.setItem('theme', desired); + if (document && document.documentElement) { + document.documentElement.dataset.theme = desired; + } + } catch { } + }, theme); + + const page = await context.newPage(); + + // Viewport for the content + await page.setViewportSize({ width: 1200, height: 1600 }); + + console.log('📄 Loading page…'); + await page.goto(baseUrl, { waitUntil: 'load', timeout: 60000 }); + + // Wait for the libraries + try { await page.waitForFunction(() => !!window.Plotly, { timeout: 8000 }); } catch { } + try { await page.waitForFunction(() => !!window.d3, { timeout: 8000 }); } catch { } + + // Determine the output file name + if (!args.filename) { + const fromBtn = await page.evaluate(() => { + const btn = 
document.getElementById('download-pdf-btn'); + const f = btn ? btn.getAttribute('data-pdf-filename') : null; + return f || ''; + }); + if (fromBtn) { + outFileBase = String(fromBtn).replace(/\.pdf$/i, '') + '-book'; + } else { + const title = await page.evaluate(() => { + const h1 = document.querySelector('h1.hero-title'); + const t = h1 ? h1.textContent : document.title; + return (t || '').replace(/\s+/g, ' ').trim(); + }); + outFileBase = slugify(title) + '-book'; + } + } + + // Wait for the content to render + if (wait === 'images' || wait === 'full') { + console.log('⏳ Waiting for images…'); + await waitForImages(page); + } + if (wait === 'd3' || wait === 'full') { + console.log('⏳ Waiting for D3…'); + await waitForD3(page); + } + if (wait === 'plotly' || wait === 'full') { + console.log('⏳ Waiting for Plotly…'); + await waitForPlotly(page); + } + if (wait === 'full') { + await waitForHtmlEmbeds(page); + await waitForStableLayout(page); + } + + // Open all accordions so they are visible in the PDF + await openAllAccordions(page); + await waitForStableLayout(page, 2000); + + // Enable print mode + await page.emulateMedia({ media: 'print' }); + + console.log('📚 Applying book styles…'); + + // Inject the book CSS + const bookCssPath = resolve(__dirname, '..', 'src', 'styles', '_print-book.css'); + const bookCss = await fs.readFile(bookCssPath, 'utf-8'); + await page.addStyleTag({ content: bookCss }); + + // Wait for the styles to be applied + await page.waitForTimeout(1000); + + // Generate the PDF with the appropriate options + const outPath = resolve(cwd, 'dist', `${outFileBase}.pdf`); + + console.log('🖨️ Generating PDF…'); + + await page.pdf({ + path: outPath, + format, + printBackground: true, + displayHeaderFooter: false, // CSS @page rules are used instead + preferCSSPageSize: false, + margin: { + top: '20mm', + right: '20mm', + bottom: '25mm', + left: '25mm' + } + }); + + // Check the size of the PDF + const stats = await 
fs.stat(outPath); + const sizeKB = Math.round(stats.size / 1024); + + console.log(`✅ PDF generated: ${outPath} (${sizeKB} KB)`); + + if (sizeKB < 10) { + console.warn('⚠️ Warning: PDF is very small, content might be missing'); + } + + // Copy into public/ + const publicPath = resolve(cwd, 'public', `${outFileBase}.pdf`); + try { + await fs.mkdir(resolve(cwd, 'public'), { recursive: true }); + await fs.copyFile(outPath, publicPath); + console.log(`✅ PDF copied to: ${publicPath}`); + } catch (e) { + console.warn('⚠️ Unable to copy PDF to public/:', e?.message || e); + } + + } finally { + await browser.close(); + } + + } finally { + // Stop the preview server + console.log('🛑 Stopping preview server…'); + try { + if (process.platform !== 'win32') { + try { process.kill(-preview.pid, 'SIGINT'); } catch { } + } + try { preview.kill('SIGINT'); } catch { } + await Promise.race([previewExit, delay(3000)]); + + if (!preview.killed) { + try { + if (process.platform !== 'win32') { + try { process.kill(-preview.pid, 'SIGKILL'); } catch { } + } + try { preview.kill('SIGKILL'); } catch { } + } catch { } + await Promise.race([previewExit, delay(1000)]); + } + } catch { } + } + + console.log(''); + console.log('╔═══════════════════════════════════════════════════════════════╗'); + console.log('║ 📚 PDF BOOK (SIMPLE) GENERATED! 📚 ║'); + console.log('╚═══════════════════════════════════════════════════════════════╝'); + console.log(''); +} + +main().catch((err) => { + console.error('❌ Error:', err); + process.exit(1); +}); + diff --git a/app/scripts/export-pdf-book.mjs b/app/scripts/export-pdf-book.mjs new file mode 100755 index 0000000000000000000000000000000000000000..888d3d07c0d2c3576681e54e4b65cc08576db375 --- /dev/null +++ b/app/scripts/export-pdf-book.mjs @@ -0,0 +1,360 @@ +#!/usr/bin/env node +/** + * Export PDF Book with Paged.js + * + * Generates a professional-quality PDF with a book-style layout + * from the HTML content built by Astro. 
+ * + * Features: + * - Automatic pagination with Paged.js + * - Running headers (chapter titles at the top of each page) + * - Page numbering + * - Different left/right margins (binding) + * - Widow/orphan control + * - Professional typography + * + * Usage: + * npm run export:pdf:book + * npm run export:pdf:book -- --theme=dark --format=A4 + * + * Options: + * --theme=light|dark Theme (default: light) + * --format=A4|Letter Page format (default: A4) + * --filename=xxx Output file name + * --wait=full Wait mode (default: full) + */ + +import { spawn } from 'node:child_process'; +import { setTimeout as delay } from 'node:timers/promises'; +import { chromium } from 'playwright'; +import { resolve, dirname } from 'node:path'; +import { promises as fs } from 'node:fs'; +import { fileURLToPath } from 'node:url'; +import process from 'node:process'; + +const __dirname = dirname(fileURLToPath(import.meta.url)); + +// ============================================================================ +// Utilities +// ============================================================================ + +async function run(command, args = [], options = {}) { + return new Promise((resolvePromise, reject) => { + const child = spawn(command, args, { stdio: 'inherit', shell: false, ...options }); + child.on('error', reject); + child.on('exit', (code) => { + if (code === 0) resolvePromise(undefined); + else reject(new Error(`${command} ${args.join(' ')} exited with code ${code}`)); + }); + }); +} + +async function waitForServer(url, timeoutMs = 60000) { + const start = Date.now(); + while (Date.now() - start < timeoutMs) { + try { + const res = await fetch(url); + if (res.ok) return; + } catch { } + await delay(500); + } + throw new Error(`Server did not start in time: ${url}`); +} + +function parseArgs(argv) { + const out = {}; + for (const arg of argv.slice(2)) { + if (!arg.startsWith('--')) continue; + const [k, v] = arg.replace(/^--/, 
'').split('='); + out[k] = v === undefined ? true : v; + } + return out; +} + +function slugify(text) { + return String(text || '') + .normalize('NFKD') + .replace(/\p{Diacritic}+/gu, '') + .toLowerCase() + .replace(/[^a-z0-9]+/g, '-') + .replace(/^-+|-+$/g, '') + .slice(0, 120) || 'article'; +} + +async function waitForImages(page, timeoutMs = 15000) { + await page.evaluate(async (timeout) => { + const deadline = Date.now() + timeout; + const imgs = Array.from(document.images || []); + const unloaded = imgs.filter(img => !img.complete || (img.naturalWidth === 0)); + await Promise.race([ + Promise.all(unloaded.map(img => new Promise(res => { + if (img.complete && img.naturalWidth !== 0) return res(undefined); + img.addEventListener('load', () => res(undefined), { once: true }); + img.addEventListener('error', () => res(undefined), { once: true }); + }))), + new Promise(res => setTimeout(res, Math.max(0, deadline - Date.now()))) + ]); + }, timeoutMs); +} + +async function waitForPlotly(page, timeoutMs = 20000) { + await page.evaluate(async (timeout) => { + const start = Date.now(); + const hasPlots = () => Array.from(document.querySelectorAll('.js-plotly-plot')).length > 0; + while (!hasPlots() && (Date.now() - start) < timeout) { + await new Promise(r => setTimeout(r, 200)); + } + const deadline = start + timeout; + const allReady = () => Array.from(document.querySelectorAll('.js-plotly-plot')).every(el => el.querySelector('svg.main-svg')); + while (!allReady() && Date.now() < deadline) { + await new Promise(r => setTimeout(r, 200)); + } + }, timeoutMs); +} + +async function waitForD3(page, timeoutMs = 20000) { + await page.evaluate(async (timeout) => { + const start = Date.now(); + const isReady = () => { + const hero = document.querySelector('.hero-banner'); + if (hero) { + return !!hero.querySelector('svg circle, svg path, svg rect, svg g'); + } + const containers = [ + ...Array.from(document.querySelectorAll('.d3-line')), + 
...Array.from(document.querySelectorAll('.d3-bar')) + ]; + if (!containers.length) return true; + return containers.every(c => c.querySelector('svg circle, svg path, svg rect, svg g')); + }; + while (!isReady() && (Date.now() - start) < timeout) { + await new Promise(r => setTimeout(r, 200)); + } + }, timeoutMs); +} + +async function waitForStableLayout(page, timeoutMs = 5000) { + const start = Date.now(); + let last = await page.evaluate(() => document.scrollingElement ? document.scrollingElement.scrollHeight : document.body.scrollHeight); + let stableCount = 0; + while ((Date.now() - start) < timeoutMs && stableCount < 3) { + await page.waitForTimeout(250); + const now = await page.evaluate(() => document.scrollingElement ? document.scrollingElement.scrollHeight : document.body.scrollHeight); + if (now === last) stableCount += 1; else { stableCount = 0; last = now; } + } +} + +// ============================================================================ +// Main script +// ============================================================================ + +async function main() { + const cwd = process.cwd(); + const port = Number(process.env.PREVIEW_PORT || 8080); + const baseUrl = `http://127.0.0.1:${port}/`; + const args = parseArgs(process.argv); + + const theme = (args.theme === 'dark' || args.theme === 'light') ? 
args.theme : 'light'; + const format = args.format || 'A4'; + const wait = args.wait || 'full'; + + let outFileBase = (args.filename && String(args.filename).replace(/\.pdf$/i, '')) || 'article-book'; + + // Build if needed + const distDir = resolve(cwd, 'dist'); + let hasDist = false; + try { + const st = await fs.stat(distDir); + hasDist = st && st.isDirectory(); + } catch { } + + if (!hasDist) { + console.log('📦 Building Astro site…'); + await run('npm', ['run', 'build']); + } else { + console.log('✓ Using existing dist/ build'); + } + + console.log('🚀 Starting Astro preview server…'); + const preview = spawn('npm', ['run', 'preview'], { cwd, stdio: 'inherit', detached: true }); + const previewExit = new Promise((resolvePreview) => { + preview.on('close', (code, signal) => resolvePreview({ code, signal })); + }); + + try { + await waitForServer(baseUrl, 60000); + console.log('✓ Server ready'); + + console.log('📖 Launching browser with Paged.js…'); + const browser = await chromium.launch({ headless: true }); + + try { + const context = await browser.newContext(); + + // Apply the theme + await context.addInitScript((desired) => { + try { + localStorage.setItem('theme', desired); + if (document && document.documentElement) { + document.documentElement.dataset.theme = desired; + } + } catch { } + }, theme); + + const page = await context.newPage(); + + // Wide viewport for the content + await page.setViewportSize({ width: 1200, height: 1600 }); + + console.log('📄 Loading page…'); + await page.goto(baseUrl, { waitUntil: 'load', timeout: 60000 }); + + // Wait for the libraries + try { await page.waitForFunction(() => !!window.Plotly, { timeout: 8000 }); } catch { } + try { await page.waitForFunction(() => !!window.d3, { timeout: 8000 }); } catch { } + + // Determine the output file name + if (!args.filename) { + const fromBtn = await page.evaluate(() => { + const btn = document.getElementById('download-pdf-btn'); + const f = btn ? 
btn.getAttribute('data-pdf-filename') : null; + return f || ''; + }); + if (fromBtn) { + outFileBase = String(fromBtn).replace(/\.pdf$/i, '') + '-book'; + } else { + const title = await page.evaluate(() => { + const h1 = document.querySelector('h1.hero-title'); + const t = h1 ? h1.textContent : document.title; + return (t || '').replace(/\s+/g, ' ').trim(); + }); + outFileBase = slugify(title) + '-book'; + } + } + + // Wait for the content to render + if (wait === 'images' || wait === 'full') { + console.log('⏳ Waiting for images…'); + await waitForImages(page); + } + if (wait === 'd3' || wait === 'full') { + console.log('⏳ Waiting for D3…'); + await waitForD3(page); + } + if (wait === 'plotly' || wait === 'full') { + console.log('⏳ Waiting for Plotly…'); + await waitForPlotly(page); + } + if (wait === 'full') { + await waitForStableLayout(page); + } + + // Enable print mode BEFORE injecting Paged.js + await page.emulateMedia({ media: 'print' }); + + console.log('📚 Injecting Paged.js…'); + + // Inject the book CSS + const bookCssPath = resolve(__dirname, '..', 'src', 'styles', '_print-book.css'); + const bookCss = await fs.readFile(bookCssPath, 'utf-8'); + await page.addStyleTag({ content: bookCss }); + + // Inject Paged.js from node_modules + const pagedJsPath = resolve(cwd, 'node_modules', 'pagedjs', 'dist', 'paged.polyfill.js'); + await page.addScriptTag({ path: pagedJsPath }); + + console.log('⏳ Running Paged.js pagination…'); + + // Run the pagination with Paged.Previewer + await page.evaluate(async () => { + if (window.Paged && window.Paged.Previewer) { + const previewer = new window.Paged.Previewer(); + await previewer.preview(); + } + }); + + // Wait for the pages to be created + await page.waitForFunction(() => { + const pages = document.querySelectorAll('.pagedjs_page'); + return pages && pages.length > 0; + }, { timeout: 60000 }); + + // Short delay to make sure everything has settled + await page.waitForTimeout(2000); + + console.log('✓ 
Pagination complete'); + + // Pagination information + const pageInfo = await page.evaluate(() => { + const pages = document.querySelectorAll('.pagedjs_page'); + return { + totalPages: pages.length, + hasContent: pages.length > 0 + }; + }); + + console.log(`📄 Generated ${pageInfo.totalPages} pages`); + + // Generate the PDF + const outPath = resolve(cwd, 'dist', `${outFileBase}.pdf`); + + console.log('🖨️ Generating PDF…'); + + await page.pdf({ + path: outPath, + format, + printBackground: true, + preferCSSPageSize: true, // Important: honors the CSS @page rules + margin: { top: 0, right: 0, bottom: 0, left: 0 } // Margins are handled by CSS + }); + + console.log(`✅ PDF generated: ${outPath}`); + + // Copy into public/ + const publicPath = resolve(cwd, 'public', `${outFileBase}.pdf`); + try { + await fs.mkdir(resolve(cwd, 'public'), { recursive: true }); + await fs.copyFile(outPath, publicPath); + console.log(`✅ PDF copied to: ${publicPath}`); + } catch (e) { + console.warn('⚠️ Unable to copy PDF to public/:', e?.message || e); + } + + } finally { + await browser.close(); + } + + } finally { + // Stop the preview server + console.log('🛑 Stopping preview server…'); + try { + if (process.platform !== 'win32') { + try { process.kill(-preview.pid, 'SIGINT'); } catch { } + } + try { preview.kill('SIGINT'); } catch { } + await Promise.race([previewExit, delay(3000)]); + + if (!preview.killed) { + try { + if (process.platform !== 'win32') { + try { process.kill(-preview.pid, 'SIGKILL'); } catch { } + } + try { preview.kill('SIGKILL'); } catch { } + } catch { } + await Promise.race([previewExit, delay(1000)]); + } + } catch { } + } + + console.log(''); + console.log('╔═══════════════════════════════════════════════════════════════╗'); + console.log('║ 📚 PDF BOOK GENERATED! 
📚 ║'); + console.log('╚═══════════════════════════════════════════════════════════════╝'); + console.log(''); +} + +main().catch((err) => { + console.error('❌ Error:', err); + process.exit(1); +}); + diff --git a/app/scripts/export-pdf.mjs b/app/scripts/export-pdf.mjs new file mode 100644 index 0000000000000000000000000000000000000000..4c36e5ba264e88dfdf2cd35174394ebe5d6114c6 --- /dev/null +++ b/app/scripts/export-pdf.mjs @@ -0,0 +1,554 @@ +#!/usr/bin/env node +import { spawn } from 'node:child_process'; +import { setTimeout as delay } from 'node:timers/promises'; +import { chromium } from 'playwright'; +import { resolve } from 'node:path'; +import { promises as fs } from 'node:fs'; +import process from 'node:process'; + +async function run(command, args = [], options = {}) { + return new Promise((resolvePromise, reject) => { + const child = spawn(command, args, { stdio: 'inherit', shell: false, ...options }); + child.on('error', reject); + child.on('exit', (code) => { + if (code === 0) resolvePromise(undefined); + else reject(new Error(`${command} ${args.join(' ')} exited with code ${code}`)); + }); + }); +} + +async function waitForServer(url, timeoutMs = 60000) { + const start = Date.now(); + while (Date.now() - start < timeoutMs) { + try { + const res = await fetch(url); + if (res.ok) return; + } catch { } + await delay(500); + } + throw new Error(`Server did not start in time: ${url}`); +} + +function parseArgs(argv) { + const out = {}; + for (const arg of argv.slice(2)) { + if (!arg.startsWith('--')) continue; + const [k, v] = arg.replace(/^--/, '').split('='); + out[k] = v === undefined ? 
true : v; + } + return out; +} + +function slugify(text) { + return String(text || '') + .normalize('NFKD') + .replace(/\p{Diacritic}+/gu, '') + .toLowerCase() + .replace(/[^a-z0-9]+/g, '-') + .replace(/^-+|-+$/g, '') + .slice(0, 120) || 'article'; +} + +function parseMargin(margin) { + if (!margin) return { top: '12mm', right: '12mm', bottom: '16mm', left: '12mm' }; + const parts = String(margin).split(',').map(s => s.trim()).filter(Boolean); + if (parts.length === 1) { + return { top: parts[0], right: parts[0], bottom: parts[0], left: parts[0] }; + } + if (parts.length === 2) { + return { top: parts[0], right: parts[1], bottom: parts[0], left: parts[1] }; + } + if (parts.length === 3) { + return { top: parts[0], right: parts[1], bottom: parts[2], left: parts[1] }; + } + return { top: parts[0] || '12mm', right: parts[1] || '12mm', bottom: parts[2] || '16mm', left: parts[3] || '12mm' }; +} + +function cssLengthToMm(val) { + if (!val) return 0; + const s = String(val).trim(); + if (/mm$/i.test(s)) return parseFloat(s); + if (/cm$/i.test(s)) return parseFloat(s) * 10; + if (/in$/i.test(s)) return parseFloat(s) * 25.4; + if (/px$/i.test(s)) return (parseFloat(s) / 96) * 25.4; // 96 CSS px per inch + const num = parseFloat(s); + return Number.isFinite(num) ? 
num : 0; // assume mm if unitless +} + +function getFormatSizeMm(format) { + const f = String(format || 'A4').toLowerCase(); + switch (f) { + case 'letter': return { w: 215.9, h: 279.4 }; + case 'legal': return { w: 215.9, h: 355.6 }; + case 'a3': return { w: 297, h: 420 }; + case 'tabloid': return { w: 279.4, h: 431.8 }; + case 'a4': + default: return { w: 210, h: 297 }; + } +} + +async function waitForImages(page, timeoutMs = 15000) { + await page.evaluate(async (timeout) => { + const deadline = Date.now() + timeout; + const imgs = Array.from(document.images || []); + const unloaded = imgs.filter(img => !img.complete || (img.naturalWidth === 0)); + await Promise.race([ + Promise.all(unloaded.map(img => new Promise(res => { + if (img.complete && img.naturalWidth !== 0) return res(undefined); + img.addEventListener('load', () => res(undefined), { once: true }); + img.addEventListener('error', () => res(undefined), { once: true }); + }))), + new Promise(res => setTimeout(res, Math.max(0, deadline - Date.now()))) + ]); + }, timeoutMs); +} + +async function waitForPlotly(page, timeoutMs = 20000) { + try { + await page.evaluate(async (timeout) => { + const start = Date.now(); + const hasPlots = () => Array.from(document.querySelectorAll('.js-plotly-plot')).length > 0; + // Wait until plots exist or timeout + while (!hasPlots() && (Date.now() - start) < timeout) { + await new Promise(r => setTimeout(r, 200)); + } + const deadline = start + timeout; + // Then wait until each plot contains the main svg + const allReady = () => Array.from(document.querySelectorAll('.js-plotly-plot')).every(el => el.querySelector('svg.main-svg')); + while (!allReady() && Date.now() < deadline) { + await new Promise(r => setTimeout(r, 200)); + } + console.log('Plotly ready or timeout'); + }, timeoutMs); + } catch (e) { + console.warn('waitForPlotly timeout or error:', e.message); + } +} + +async function waitForD3(page, timeoutMs = 20000) { + try { + await page.evaluate(async (timeout) => { 
+ const start = Date.now(); + const isReady = () => { + // Prioritize hero banner if present (generic container) + const hero = document.querySelector('.hero-banner'); + if (hero) { + return !!hero.querySelector('svg circle, svg path, svg rect, svg g'); + } + // Else require all D3 containers on page to have shapes + const containers = [ + ...Array.from(document.querySelectorAll('.d3-line')), + ...Array.from(document.querySelectorAll('.d3-bar')) + ]; + if (!containers.length) return true; + return containers.every(c => c.querySelector('svg circle, svg path, svg rect, svg g')); + }; + while (!isReady() && (Date.now() - start) < timeout) { + await new Promise(r => setTimeout(r, 200)); + } + console.log('D3 ready or timeout'); + }, timeoutMs); + } catch (e) { + console.warn('waitForD3 timeout or error:', e.message); + } +} + +async function waitForStableLayout(page, timeoutMs = 5000) { + const start = Date.now(); + let last = await page.evaluate(() => document.scrollingElement ? document.scrollingElement.scrollHeight : document.body.scrollHeight); + let stableCount = 0; + while ((Date.now() - start) < timeoutMs && stableCount < 3) { + await page.waitForTimeout(250); + const now = await page.evaluate(() => document.scrollingElement ? document.scrollingElement.scrollHeight : document.body.scrollHeight); + if (now === last) stableCount += 1; else { stableCount = 0; last = now; } + } +} + +async function main() { + const cwd = process.cwd(); + const port = Number(process.env.PREVIEW_PORT || 8080); + const baseUrl = `http://127.0.0.1:${port}/`; + const args = parseArgs(process.argv); + // Default: light (do not rely on env vars implicitly) + const theme = (args.theme === 'dark' || args.theme === 'light') ? 
args.theme : 'light'; + const format = args.format || 'A4'; + const margin = parseMargin(args.margin); + const wait = (args.wait || 'full'); // 'networkidle' | 'images' | 'd3' | 'plotly' | 'full' + const bookMode = !!args.book; // Enable book mode with --book + + // filename can be provided, else computed from DOM (button) or page title later + let outFileBase = (args.filename && String(args.filename).replace(/\.pdf$/i, '')) || 'article'; + + // Build only if dist/ does not exist + const distDir = resolve(cwd, 'dist'); + let hasDist = false; + try { + const st = await fs.stat(distDir); + hasDist = st && st.isDirectory(); + } catch { } + if (!hasDist) { + console.log('> Building Astro site…'); + await run('npm', ['run', 'build']); + } else { + console.log('> Skipping build (dist/ exists)…'); + } + + console.log('> Starting Astro preview…'); + // Start preview in its own process group so we can terminate all children reliably + const preview = spawn('npm', ['run', 'preview'], { cwd, stdio: 'inherit', detached: true }); + const previewExit = new Promise((resolvePreview) => { + preview.on('close', (code, signal) => resolvePreview({ code, signal })); + }); + + try { + await waitForServer(baseUrl, 60000); + console.log('> Server ready, generating PDF…'); + + const browser = await chromium.launch({ headless: true }); + try { + const context = await browser.newContext(); + await context.addInitScript((desired) => { + try { + localStorage.setItem('theme', desired); + // Apply theme immediately to avoid flashes + if (document && document.documentElement) { + document.documentElement.dataset.theme = desired; + } + } catch { } + }, theme); + const page = await context.newPage(); + // Pre-fit viewport width to printable width so charts size correctly + const fmt = getFormatSizeMm(format); + const mw = fmt.w - cssLengthToMm(margin.left) - cssLengthToMm(margin.right); + const printableWidthPx = Math.max(320, Math.round((mw / 25.4) * 96)); + await page.setViewportSize({ width: 
printableWidthPx, height: 1200 }); + await page.goto(baseUrl, { waitUntil: 'load', timeout: 60000 }); + // Give time for CDN scripts (Plotly/D3) to attach and for our fragment hooks to run + try { await page.waitForFunction(() => !!window.Plotly, { timeout: 8000 }); } catch { } + try { await page.waitForFunction(() => !!window.d3, { timeout: 8000 }); } catch { } + // Prefer explicit filename from the download button if present + if (!args.filename) { + const fromBtn = await page.evaluate(() => { + const btn = document.getElementById('download-pdf-btn'); + const f = btn ? btn.getAttribute('data-pdf-filename') : null; + return f || ''; + }); + if (fromBtn) { + outFileBase = String(fromBtn).replace(/\.pdf$/i, ''); + } else { + // Fallback: compute slug from hero title or document.title + const title = await page.evaluate(() => { + const h1 = document.querySelector('h1.hero-title'); + const t = h1 ? h1.textContent : document.title; + return (t || '').replace(/\s+/g, ' ').trim(); + }); + outFileBase = slugify(title); + } + // Append the -book suffix in book mode + if (bookMode) { + outFileBase += '-book'; + } + } + + // Wait for render readiness + if (wait === 'images' || wait === 'full') { + console.log('⏳ Waiting for images…'); + await waitForImages(page); + } + if (wait === 'd3' || wait === 'full') { + console.log('⏳ Waiting for D3…'); + await waitForD3(page); + } + if (wait === 'plotly' || wait === 'full') { + console.log('⏳ Waiting for Plotly…'); + await waitForPlotly(page); + } + if (wait === 'full') { + console.log('⏳ Waiting for stable layout…'); + await waitForStableLayout(page); + } + + // Book mode: open all accordions + if (bookMode) { + console.log('📂 Opening all accordions for book mode…'); + await page.evaluate(() => { + const accordions = document.querySelectorAll('details.accordion, details'); + accordions.forEach((accordion) => { + if (!accordion.hasAttribute('open')) { + accordion.setAttribute('open', ''); + const wrapper = 
accordion.querySelector('.accordion__content-wrapper'); + if (wrapper) { + wrapper.style.height = 'auto'; + wrapper.style.overflow = 'visible'; + } + } + }); + }); + await waitForStableLayout(page, 2000); + } + + await page.emulateMedia({ media: 'print' }); + + // Enforce responsive sizing for SVG/iframes by removing hard attrs and injecting CSS (top-level and inside same-origin iframes) + try { + await page.evaluate(() => { + function isSmallSvg(svg) { + try { + const vb = svg && svg.viewBox && svg.viewBox.baseVal ? svg.viewBox.baseVal : null; + if (vb && vb.width && vb.height && vb.width <= 50 && vb.height <= 50) return true; + const r = svg.getBoundingClientRect && svg.getBoundingClientRect(); + if (r && r.width && r.height && r.width <= 50 && r.height <= 50) return true; + } catch { } + return false; + } + function lockSmallSvgSize(svg) { + try { + const r = svg.getBoundingClientRect ? svg.getBoundingClientRect() : null; + const w = (r && r.width) ? Math.round(r.width) : null; + const h = (r && r.height) ? 
Math.round(r.height) : null; + if (w) svg.style.setProperty('width', w + 'px', 'important'); + if (h) svg.style.setProperty('height', h + 'px', 'important'); + svg.style.setProperty('max-width', 'none', 'important'); + } catch { } + } + function fixSvg(svg) { + if (!svg) return; + // Do not alter hero banner SVG sizing; it may rely on explicit width/height + try { if (svg.closest && svg.closest('.hero-banner')) return; } catch { } + if (isSmallSvg(svg)) { lockSmallSvgSize(svg); return; } + try { svg.removeAttribute('width'); } catch { } + try { svg.removeAttribute('height'); } catch { } + svg.style.maxWidth = '100%'; + svg.style.width = '100%'; + svg.style.height = 'auto'; + if (!svg.getAttribute('preserveAspectRatio')) svg.setAttribute('preserveAspectRatio', 'xMidYMid meet'); + } + document.querySelectorAll('svg').forEach(fixSvg); + document.querySelectorAll('.mermaid, .mermaid svg').forEach((el) => { + if (el.tagName && el.tagName.toLowerCase() === 'svg') fixSvg(el); + else { el.style.display = 'block'; el.style.width = '100%'; el.style.maxWidth = '100%'; } + }); + document.querySelectorAll('iframe, embed, object').forEach((el) => { + el.style.width = '100%'; + el.style.maxWidth = '100%'; + try { el.removeAttribute('width'); } catch { } + // Best-effort inject into same-origin frames + try { + const doc = (el.tagName.toLowerCase() === 'object' ? 
el.contentDocument : el.contentDocument); + if (doc && doc.head) { + const s = doc.createElement('style'); + s.textContent = 'html,body{overflow-x:hidden;} svg,canvas,img,video{max-width:100%!important;height:auto!important;} svg[width]{width:100%!important}'; + doc.head.appendChild(s); + doc.querySelectorAll('svg').forEach((svg) => { if (isSmallSvg(svg)) lockSmallSvgSize(svg); else fixSvg(svg); }); + } + } catch (_) { /* cross-origin; ignore */ } + }); + }); + } catch { } + + // Generate OG thumbnail (1200x630) + try { + const ogW = 1200, ogH = 630; + await page.setViewportSize({ width: ogW, height: ogH }); + // Give layout a tick to adjust + await page.waitForTimeout(200); + // Ensure layout & D3 re-rendered after viewport change + await page.evaluate(() => { window.scrollTo(0, 0); window.dispatchEvent(new Event('resize')); }); + try { await waitForD3(page, 8000); } catch { } + + // Temporarily improve visibility for light theme thumbnails + // - Force normal blend for points + // - Ensure an SVG background (CSS background on svg element) + const cssHandle = await page.addStyleTag({ + content: ` + .hero .points { mix-blend-mode: normal !important; } + ` }); + const thumbPath = resolve(cwd, 'dist', 'thumb.auto.jpg'); + await page.screenshot({ path: thumbPath, type: 'jpeg', quality: 85, fullPage: false }); + // Also emit PNG for compatibility if needed + const thumbPngPath = resolve(cwd, 'dist', 'thumb.auto.png'); + await page.screenshot({ path: thumbPngPath, type: 'png', fullPage: false }); + const publicThumb = resolve(cwd, 'public', 'thumb.auto.jpg'); + const publicThumbPng = resolve(cwd, 'public', 'thumb.auto.png'); + try { await fs.copyFile(thumbPath, publicThumb); } catch { } + try { await fs.copyFile(thumbPngPath, publicThumbPng); } catch { } + // Remove temporary style so PDF is unaffected + try { await cssHandle.evaluate((el) => el.remove()); } catch { } + console.log(`✅ OG thumbnail generated: ${thumbPath}`); + } catch (e) { + console.warn('Unable to 
generate OG thumbnail:', e?.message || e); + } + const outPath = resolve(cwd, 'dist', `${outFileBase}.pdf`); + // Restore viewport to printable width before PDF (thumbnail changed it) + try { + const fmt2 = getFormatSizeMm(format); + const mw2 = fmt2.w - cssLengthToMm(margin.left) - cssLengthToMm(margin.right); + const printableWidthPx2 = Math.max(320, Math.round((mw2 / 25.4) * 96)); + await page.setViewportSize({ width: printableWidthPx2, height: 1400 }); + await page.evaluate(() => { window.scrollTo(0, 0); window.dispatchEvent(new Event('resize')); }); + try { await waitForD3(page, 8000); } catch { } + await waitForStableLayout(page); + // Re-apply responsive fixes after viewport change + try { + await page.evaluate(() => { + function isSmallSvg(svg) { + try { + const vb = svg && svg.viewBox && svg.viewBox.baseVal ? svg.viewBox.baseVal : null; + if (vb && vb.width && vb.height && vb.width <= 50 && vb.height <= 50) return true; + const r = svg.getBoundingClientRect && svg.getBoundingClientRect(); + if (r && r.width && r.height && r.width <= 50 && r.height <= 50) return true; + } catch { } + return false; + } + function lockSmallSvgSize(svg) { + try { + const r = svg.getBoundingClientRect ? svg.getBoundingClientRect() : null; + const w = (r && r.width) ? Math.round(r.width) : null; + const h = (r && r.height) ? 
Math.round(r.height) : null; + if (w) svg.style.setProperty('width', w + 'px', 'important'); + if (h) svg.style.setProperty('height', h + 'px', 'important'); + svg.style.setProperty('max-width', 'none', 'important'); + } catch { } + } + function fixSvg(svg) { + if (!svg) return; + // Do not alter hero banner SVG sizing; it may rely on explicit width/height + try { if (svg.closest && svg.closest('.hero-banner')) return; } catch { } + if (isSmallSvg(svg)) { lockSmallSvgSize(svg); return; } + try { svg.removeAttribute('width'); } catch { } + try { svg.removeAttribute('height'); } catch { } + svg.style.maxWidth = '100%'; + svg.style.width = '100%'; + svg.style.height = 'auto'; + if (!svg.getAttribute('preserveAspectRatio')) svg.setAttribute('preserveAspectRatio', 'xMidYMid meet'); + } + document.querySelectorAll('svg').forEach((svg) => { if (isSmallSvg(svg)) lockSmallSvgSize(svg); else fixSvg(svg); }); + document.querySelectorAll('.mermaid, .mermaid svg').forEach((el) => { + if (el.tagName && el.tagName.toLowerCase() === 'svg') fixSvg(el); + else { el.style.display = 'block'; el.style.width = '100%'; el.style.maxWidth = '100%'; } + }); + document.querySelectorAll('iframe, embed, object').forEach((el) => { + el.style.width = '100%'; + el.style.maxWidth = '100%'; + try { el.removeAttribute('width'); } catch { } + try { + const doc = (el.tagName.toLowerCase() === 'object' ? 
el.contentDocument : el.contentDocument); + if (doc && doc.head) { + const s = doc.createElement('style'); + s.textContent = 'html,body{overflow-x:hidden;} svg,canvas,img,video{max-width:100%!important;height:auto!important;} svg[width]{width:100%!important}'; + doc.head.appendChild(s); + doc.querySelectorAll('svg').forEach((svg) => { if (isSmallSvg(svg)) lockSmallSvgSize(svg); else fixSvg(svg); }); + } + } catch (_) { } + }); + }); + } catch { } + } catch { } + + // Inject styles for PDF + let pdfCssHandle = null; + try { + if (bookMode) { + // Book mode: inject the full book stylesheet + console.log('📚 Applying book styles…'); + const bookCssPath = resolve(cwd, 'src', 'styles', '_print-book.css'); + const bookCss = await fs.readFile(bookCssPath, 'utf-8'); + pdfCssHandle = await page.addStyleTag({ content: bookCss }); + await page.waitForTimeout(500); + } else { + // Normal mode: basic responsive styles + pdfCssHandle = await page.addStyleTag({ + content: ` + /* General container safety */ + html, body { overflow-x: hidden !important; } + + /* Make all vector/bitmap media responsive for print */ + svg, canvas, img, video { max-width: 100% !important; height: auto !important; } + /* Mermaid diagrams */ + .mermaid, .mermaid svg { display: block; width: 100% !important; max-width: 100% !important; height: auto !important; } + /* Any explicit width attributes */ + svg[width] { width: 100% !important; } + /* Iframes and similar embeds */ + iframe, embed, object { width: 100% !important; max-width: 100% !important; height: auto; } + + /* HtmlEmbed wrappers (defensive) */ + .html-embed, .html-embed__card { max-width: 100% !important; width: 100% !important; } + .html-embed__card > div[id^="frag-"] { width: 100% !important; max-width: 100% !important; } + + /* Banner centering & visibility */ + .hero .points { mix-blend-mode: normal !important; } + /* Do NOT force a fixed height to avoid clipping in PDF */ + .hero-banner { width: 100% !important; max-width: 980px 
!important; margin-left: auto !important; margin-right: auto !important; } + /* Generalize banner styles for all banner types */ + .hero-banner svg, + .hero-banner canvas, + .hero-banner .d3-galaxy, + .hero-banner .threejs-galaxy, + .hero-banner .d3-latent-space, + .hero-banner .neural-flow, + .hero-banner .molecular-space, + .hero-banner [class*="banner"] { + width: 100% !important; + height: auto !important; + max-width: 980px !important; + } + ` }); + } + } catch { } + await page.pdf({ + path: outPath, + format, + printBackground: true, + displayHeaderFooter: false, + preferCSSPageSize: false, + margin: bookMode ? { + top: '20mm', + right: '20mm', + bottom: '25mm', + left: '25mm' + } : margin + }); + try { if (pdfCssHandle) await pdfCssHandle.evaluate((el) => el.remove()); } catch { } + console.log(`✅ PDF generated: ${outPath}`); + + // Copy into public only under the slugified name + const publicSlugPath = resolve(cwd, 'public', `${outFileBase}.pdf`); + try { + await fs.mkdir(resolve(cwd, 'public'), { recursive: true }); + await fs.copyFile(outPath, publicSlugPath); + console.log(`✅ PDF copied to: ${publicSlugPath}`); + } catch (e) { + console.warn('Unable to copy PDF to public/:', e?.message || e); + } + } finally { + await browser.close(); + } + } finally { + // Try a clean shutdown of preview (entire process group first) + try { + if (process.platform !== 'win32') { + try { process.kill(-preview.pid, 'SIGINT'); } catch { } + } + try { preview.kill('SIGINT'); } catch { } + await Promise.race([previewExit, delay(3000)]); + // Force kill if still alive + // eslint-disable-next-line no-unsafe-optional-chaining + if (!preview.killed) { + try { + if (process.platform !== 'win32') { + try { process.kill(-preview.pid, 'SIGKILL'); } catch { } + } + try { preview.kill('SIGKILL'); } catch { } + } catch { } + await Promise.race([previewExit, delay(1000)]); + } + } catch { } + } +} + +main().catch((err) => { + console.error(err); + process.exit(1); +}); + + diff --git 
a/app/scripts/generate-trackio-data.mjs b/app/scripts/generate-trackio-data.mjs new file mode 100644 index 0000000000000000000000000000000000000000..cbac5cb711cd5765e00c985d5c92d8eb1251631c --- /dev/null +++ b/app/scripts/generate-trackio-data.mjs @@ -0,0 +1,196 @@ +#!/usr/bin/env node + +// Generate synthetic Trackio-like CSV data with realistic ML curves. +// - Steps are simple integers (e.g., 1..N) +// - Metrics: epoch, train_accuracy, val_accuracy, train_loss, val_loss +// - W&B-like run names (e.g., pleasant-flower-1) +// - Deterministic with --seed +// +// Usage: +// node app/scripts/generate-trackio-data.mjs \ +// --runs 3 \ +// --steps 10 \ +// --out app/src/content/assets/data/trackio_wandb_synth.csv \ +// [--seed 42] [--epoch-max 3.0] [--amount 1.0] [--start 1] +// +// To overwrite the demo file used by the embed: +// node app/scripts/generate-trackio-data.mjs --runs 3 --steps 10 --out app/src/content/assets/data/trackio_wandb_demo.csv --seed 1337 + +import fs from 'node:fs/promises'; +import path from 'node:path'; + +function parseArgs(argv){ + const args = { runs: 3, steps: 10, out: '', seed: undefined, epochMax: 3.0, amount: 1, start: 1 }; + for (let i = 2; i < argv.length; i++){ + const a = argv[i]; + if (a === '--runs' && argv[i+1]) { args.runs = Math.max(1, parseInt(argv[++i], 10) || 3); continue; } + if (a === '--steps' && argv[i+1]) { args.steps = Math.max(2, parseInt(argv[++i], 10) || 10); continue; } + if (a === '--out' && argv[i+1]) { args.out = argv[++i]; continue; } + if (a === '--seed' && argv[i+1]) { args.seed = Number(argv[++i]); continue; } + if (a === '--epoch-max' && argv[i+1]) { args.epochMax = Number(argv[++i]) || 3.0; continue; } + if (a === '--amount' && argv[i+1]) { args.amount = Number(argv[++i]) || 1.0; continue; } + if (a === '--start' && argv[i+1]) { args.start = parseInt(argv[++i], 10) || 1; continue; } + } + if (!args.out) { + args.out = path.join('app', 'src', 'content', 'assets', 'data', 'trackio_wandb_synth.csv'); + } + 
return args; +} + +function mulberry32(seed){ + let t = seed >>> 0; + return function(){ + t += 0x6D2B79F5; + let r = Math.imul(t ^ (t >>> 15), 1 | t); + r ^= r + Math.imul(r ^ (r >>> 7), 61 | r); + return ((r ^ (r >>> 14)) >>> 0) / 4294967296; + }; +} + +function makeRng(seed){ + if (Number.isFinite(seed)) return mulberry32(seed); + return Math.random; +} + +function randn(rng){ + // Box-Muller transform + let u = 0, v = 0; + while (u === 0) u = rng(); + while (v === 0) v = rng(); + return Math.sqrt(-2.0 * Math.log(u)) * Math.cos(2.0 * Math.PI * v); +} + +function clamp(x, lo, hi){ + return Math.max(lo, Math.min(hi, x)); +} + +function logistic(t, k=6, x0=0.5){ + // 1 / (1 + e^{-k (t - x0)}) in [0,1] + return 1 / (1 + Math.exp(-k * (t - x0))); +} + +function expDecay(t, k=3){ + // (1 - e^{-k t}) in [0,1] + return 1 - Math.exp(-k * t); +} + +function pick(array, rng){ + return array[Math.floor(rng() * array.length) % array.length]; +} + +function buildRunNames(count, rng){ + const adjectives = [ + 'pleasant','brisk','silent','ancient','bold','gentle','rapid','shy','curious','lively', + 'fearless','soothing','glossy','hidden','misty','bright','calm','keen','noble','swift' + ]; + const nouns = [ + 'flower','glade','sky','river','forest','ember','comet','meadow','harbor','dawn', + 'mountain','prairie','breeze','valley','lagoon','desert','monsoon','reef','thunder','willow' + ]; + const names = new Set(); + let attempts = 0; + while (names.size < count && attempts < count * 20){ + attempts++; + const left = pick(adjectives, rng); + const right = pick(nouns, rng); + const idx = 1 + Math.floor(rng() * 9); + names.add(`${left}-${right}-${idx}`); + } + return Array.from(names); +} + +function formatLike(value, decimals){ + return Number.isFinite(decimals) && decimals >= 0 ? value.toFixed(decimals) : String(value); +} + +async function main(){ + const args = parseArgs(process.argv); + const rng = makeRng(args.seed); + + // Steps: integers from start .. 
start+steps-1 + const steps = Array.from({ length: args.steps }, (_, i) => args.start + i); + const stepNorm = (i) => (i - steps[0]) / (steps[steps.length-1] - steps[0]); + + const runs = buildRunNames(args.runs, rng); + + // Per-run slight variations + const runParams = runs.map((_r, idx) => { + const r = rng(); + // Final accuracies + const trainAccFinal = clamp(0.86 + (r - 0.5) * 0.12 * args.amount, 0.78, 0.97); + const valAccFinal = clamp(trainAccFinal - (0.02 + rng() * 0.05), 0.70, 0.95); + // Loss plateau + const lossStart = 7.0 + (rng() - 0.5) * 0.10 * args.amount; // ~7.0 ±0.05 + const lossPlateau = 6.78 + (rng() - 0.5) * 0.04 * args.amount; // ~6.78 ±0.02 + const lossK = 2.0 + rng() * 1.5; // decay speed + // Acc growth steepness and midpoint + const kAcc = 4.5 + rng() * 3.0; + const x0Acc = 0.35 + rng() * 0.25; + return { trainAccFinal, valAccFinal, lossStart, lossPlateau, lossK, kAcc, x0Acc }; + }); + + const lines = []; + lines.push('run,step,metric,value,stderr'); + + // EPOCH: linear 0..epochMax across steps + for (let r = 0; r < runs.length; r++){ + const run = runs[r]; + for (let i = 0; i < steps.length; i++){ + const t = stepNorm(steps[i]); + const epoch = args.epochMax * t; + lines.push(`${run},${steps[i]},epoch,${formatLike(epoch, 2)},`); + } + } + + // TRAIN LOSS & VAL LOSS + for (let r = 0; r < runs.length; r++){ + const run = runs[r]; + const p = runParams[r]; + let prevTrain = null; + let prevVal = null; + for (let i = 0; i < steps.length; i++){ + const t = stepNorm(steps[i]); + const d = expDecay(t, p.lossK); // 0..1 + let trainLoss = p.lossStart - (p.lossStart - p.lossPlateau) * d; + let valLoss = trainLoss + 0.02 + (rng() * 0.03); + // Add mild noise + trainLoss += randn(rng) * 0.01 * args.amount; + valLoss += randn(rng) * 0.012 * args.amount; + // Keep reasonable and mostly monotonic (small upward blips allowed) + if (prevTrain != null) trainLoss = Math.min(prevTrain + 0.01, trainLoss); + if (prevVal != null) valLoss = Math.min(prevVal + 
0.012, valLoss); + prevTrain = trainLoss; prevVal = valLoss; + const stderrTrain = clamp(0.03 - 0.02 * t + Math.abs(randn(rng)) * 0.003, 0.006, 0.04); + const stderrVal = clamp(0.035 - 0.022 * t + Math.abs(randn(rng)) * 0.003, 0.008, 0.045); + lines.push(`${run},${steps[i]},train_loss,${formatLike(trainLoss, 3)},${formatLike(stderrTrain, 3)}`); + lines.push(`${run},${steps[i]},val_loss,${formatLike(valLoss, 3)},${formatLike(stderrVal, 3)}`); + } + } + + // TRAIN ACCURACY & VAL ACCURACY (logistic) + for (let r = 0; r < runs.length; r++){ + const run = runs[r]; + const p = runParams[r]; + for (let i = 0; i < steps.length; i++){ + const t = stepNorm(steps[i]); + const accBase = logistic(t, p.kAcc, p.x0Acc); + let trainAcc = clamp(0.55 + accBase * (p.trainAccFinal - 0.55), 0, 1); + let valAcc = clamp(0.52 + accBase * (p.valAccFinal - 0.52), 0, 1); + // Gentle noise + trainAcc = clamp(trainAcc + randn(rng) * 0.005 * args.amount, 0, 1); + valAcc = clamp(valAcc + randn(rng) * 0.006 * args.amount, 0, 1); + const stderrTrain = clamp(0.02 - 0.011 * t + Math.abs(randn(rng)) * 0.002, 0.006, 0.03); + const stderrVal = clamp(0.022 - 0.012 * t + Math.abs(randn(rng)) * 0.002, 0.007, 0.032); + lines.push(`${run},${steps[i]},train_accuracy,${formatLike(trainAcc, 4)},${formatLike(stderrTrain, 3)}`); + lines.push(`${run},${steps[i]},val_accuracy,${formatLike(valAcc, 4)},${formatLike(stderrVal, 3)}`); + } + } + + // Ensure directory exists + await fs.mkdir(path.dirname(args.out), { recursive: true }); + await fs.writeFile(args.out, lines.join('\n') + '\n', 'utf8'); + const relOut = path.relative(process.cwd(), args.out); + console.log(`Synthetic CSV generated: ${relOut}`); +} + +main().catch(err => { console.error(err?.stack || String(err)); process.exit(1); }); diff --git a/app/scripts/jitter-trackio-data.mjs b/app/scripts/jitter-trackio-data.mjs new file mode 100644 index 0000000000000000000000000000000000000000..ed09c7f702f5a0f4ada98a90313a449e04debee8 --- /dev/null +++ 
b/app/scripts/jitter-trackio-data.mjs @@ -0,0 +1,129 @@ +#!/usr/bin/env node + +// Jitter Trackio CSV data with small, controlled noise. +// - Preserves comments (# ...) and blank lines +// - Leaves 'epoch' values unchanged +// - Adds mild noise to train/val accuracy (clamped to [0,1]) +// - Adds mild noise to train/val loss (kept >= 0) +// - Keeps steps untouched +// Usage: +// node app/scripts/jitter-trackio-data.mjs \ +// --in app/src/content/assets/data/trackio_wandb_demo.csv \ +// --out app/src/content/assets/data/trackio_wandb_demo.jitter.csv \ +// [--seed 42] [--amount 1.0] [--in-place] + +import fs from 'node:fs/promises'; +import path from 'node:path'; + +function parseArgs(argv){ + const args = { in: '', out: '', seed: undefined, amount: 1, inPlace: false }; + for (let i = 2; i < argv.length; i++){ + const a = argv[i]; + if (a === '--in' && argv[i+1]) { args.in = argv[++i]; continue; } + if (a === '--out' && argv[i+1]) { args.out = argv[++i]; continue; } + if (a === '--seed' && argv[i+1]) { args.seed = Number(argv[++i]); continue; } + if (a === '--amount' && argv[i+1]) { args.amount = Number(argv[++i]) || 1.0; continue; } + if (a === '--in-place') { args.inPlace = true; continue; } + } + if (!args.in) throw new Error('--in is required'); + if (args.inPlace) args.out = args.in; + if (!args.out) { + const { dir, name, ext } = path.parse(args.in); + args.out = path.join(dir, `${name}.jitter${ext || '.csv'}`); + } + return args; +} + +function mulberry32(seed){ + let t = seed >>> 0; + return function(){ + t += 0x6D2B79F5; + let r = Math.imul(t ^ (t >>> 15), 1 | t); + r ^= r + Math.imul(r ^ (r >>> 7), 61 | r); + return ((r ^ (r >>> 14)) >>> 0) / 4294967296; + }; +} + +function makeRng(seed){ + if (Number.isFinite(seed)) return mulberry32(seed); + return Math.random; +} + +function randn(rng){ + // Box-Muller transform + let u = 0, v = 0; + while (u === 0) u = rng(); + while (v === 0) v = rng(); + return Math.sqrt(-2.0 * Math.log(u)) * Math.cos(2.0 * Math.PI * 
v); +} + +function jitterValue(metric, value, amount, rng){ + const m = metric.toLowerCase(); + if (m === 'epoch') return value; // keep as-is + if (m.includes('accuracy')){ + const n = Math.max(-0.02 * amount, Math.min(0.02 * amount, randn(rng) * 0.01 * amount)); + return Math.max(0, Math.min(1, value + n)); + } + if (m.includes('loss')){ + const n = Math.max(-0.03 * amount, Math.min(0.03 * amount, randn(rng) * 0.01 * amount)); + return Math.max(0, value + n); + } + // default: tiny noise + const n = Math.max(-0.01 * amount, Math.min(0.01 * amount, randn(rng) * 0.005 * amount)); + return value + n; +} + +function formatNumberLike(original, value){ + const s = String(original); + const dot = s.indexOf('.') + const decimals = dot >= 0 ? (s.length - dot - 1) : 0; + if (!Number.isFinite(value)) return s; + if (decimals <= 0) return String(Math.round(value)); + return value.toFixed(decimals); +} + +async function main(){ + const args = parseArgs(process.argv); + const rng = makeRng(args.seed); + const raw = await fs.readFile(args.in, 'utf8'); + const lines = raw.split(/\r?\n/); + const out = new Array(lines.length); + + for (let i = 0; i < lines.length; i++){ + const line = lines[i]; + if (!line || line.trim().length === 0) { out[i] = line; continue; } + if (/^\s*#/.test(line)) { out[i] = line; continue; } + + // Preserve header line unmodified + if (i === 0 && /^\s*run\s*,\s*step\s*,\s*metric\s*,\s*value\s*,\s*stderr\s*$/i.test(line)) { + out[i] = line; continue; + } + + const cols = line.split(','); + if (cols.length < 4) { out[i] = line; continue; } + + const [run, stepStr, metric, valueStr, stderrStr = ''] = cols; + const trimmedMetric = (metric || '').trim(); + const valueNum = Number((valueStr || '').trim()); + + if (!Number.isFinite(valueNum)) { out[i] = line; continue; } + + const jittered = jitterValue(trimmedMetric, valueNum, args.amount, rng); + const valueOut = formatNumberLike(valueStr, jittered); + + // Reassemble with original column count and positions 
+ const result = [run, stepStr, metric, valueOut, stderrStr].join(','); + out[i] = result; + } + + const finalText = out.join('\n'); + await fs.writeFile(args.out, finalText, 'utf8'); + const relIn = path.relative(process.cwd(), args.in); + const relOut = path.relative(process.cwd(), args.out); + console.log(`Jittered data written: ${relOut} (from ${relIn})`); +} + +main().catch(err => { + console.error(err?.stack || String(err)); + process.exit(1); +}); diff --git a/app/scripts/latex-importer/README.md b/app/scripts/latex-importer/README.md new file mode 100644 index 0000000000000000000000000000000000000000..4c8a36f3739569c7d033658e82937ad2da5422e6 --- /dev/null +++ b/app/scripts/latex-importer/README.md @@ -0,0 +1,169 @@ +# LaTeX Importer + +Complete LaTeX to MDX (Markdown + JSX) importer optimized for Astro with advanced support for references, interactive equations, and components. + +## 🚀 Quick Start + +```bash +# Complete LaTeX → MDX conversion with all features +node index.mjs + +# For step-by-step debugging +node latex-converter.mjs # LaTeX → Markdown +node mdx-converter.mjs # Markdown → MDX +``` + +## 📁 Structure + +``` +latex-importer/ +├── index.mjs # Complete LaTeX → MDX pipeline +├── latex-converter.mjs # LaTeX → Markdown with Pandoc +├── mdx-converter.mjs # Markdown → MDX with Astro components +├── reference-preprocessor.mjs # LaTeX references cleanup +├── post-processor.mjs # Markdown post-processing +├── bib-cleaner.mjs # Bibliography cleaner +├── filters/ +│ └── equation-ids.lua # Pandoc filter for KaTeX equations +├── input/ # LaTeX sources +│ ├── main.tex +│ ├── main.bib +│ └── sections/ +└── output/ # Results + ├── main.md # Intermediate Markdown + └── main.mdx # Final MDX for Astro +``` + +## ✨ Key Features + +### 🎯 **Smart References** +- **Invisible anchors**: Automatic conversion of `\label{}` to `` +- **Clean links**: Identifier cleanup (`:` → `-`, removing prefixes `sec:`, `fig:`, `eq:`) +- **Cross-references**: Full support for `\ref{}` 
with functional links + +### 🧮 **Interactive Equations** +- **KaTeX IDs**: Conversion of `\label{eq:...}` to `\htmlId{id}{equation}` +- **Equation references**: Clickable links to mathematical equations +- **Advanced KaTeX support**: `trust: true` configuration for `\htmlId{}` + +### 🎨 **Automatic Styling** +- **Highlights**: `\highlight{text}` → `text` +- **Auto cleanup**: Removal of numbering `(1)`, `(2)`, etc. +- **Astro components**: Images → `Figure` with automatic imports + +### 🔧 **Robust Pipeline** +- **LaTeX preprocessor**: Reference cleanup before Pandoc +- **Lua filter**: Equation processing in Pandoc AST +- **Post-processor**: Markdown cleanup and optimization +- **MDX converter**: Final transformation with Astro components + +## 📊 Example Workflow + +```bash +# 1. Prepare LaTeX sources +cp my-paper/* input/ + +# 2. Complete automatic conversion +node index.mjs + +# 3. Generated results +ls output/ +# → main.md (Intermediate Markdown) +# → main.mdx (Final MDX for Astro) +# → assets/image/ (extracted images) +``` + +### 📋 Conversion Result + +The pipeline generates an MDX file optimized for Astro with: + +```mdx +--- +title: "Your Article Title" +description: "Generated from LaTeX" +--- + +import Figure from '../components/Figure.astro'; +import figure1 from '../assets/image/figure1.png'; + +## Section with invisible anchor + + +Here is some text with highlighted words. + +Reference to an interactive [equation](#equation-name). + +Equation with KaTeX ID: +$$\htmlId{equation-name}{E = mc^2}$$ + +
        +``` + +## ⚙️ Required Astro Configuration + +To use equations with IDs, add to `astro.config.mjs`: + +```javascript +import rehypeKatex from 'rehype-katex'; + +export default defineConfig({ + markdown: { + rehypePlugins: [ + [rehypeKatex, { trust: true }], // ← Important for \htmlId{} + ], + }, +}); +``` + +## 🛠️ Prerequisites + +- **Node.js** with ESM support +- **Pandoc** (`brew install pandoc`) +- **Astro** to use the generated MDX + +## 🎯 Technical Architecture + +### 4-Stage Pipeline + +1. **LaTeX Preprocessing** (`reference-preprocessor.mjs`) + - Cleanup of `\label{}` and `\ref{}` + - Conversion `\highlight{}` → CSS spans + - Removal of prefixes and problematic characters + +2. **Pandoc + Lua Filter** (`equation-ids.lua`) + - LaTeX → Markdown conversion with `gfm+tex_math_dollars+raw_html` + - Equation processing: `\label{eq:name}` → `\htmlId{name}{equation}` + - Automatic image extraction + +3. **Markdown Post-processing** (`post-processor.mjs`) + - KaTeX, Unicode, grouping commands cleanup + - Attribute correction with `:` + - Code snippet injection + +4. 
**MDX Conversion** (`mdx-converter.mjs`) + - Image transformation → `Figure` + - HTML span escaping correction + - Automatic import generation + - MDX frontmatter + +## 📊 Conversion Statistics + +For a typical scientific document: +- **87 labels** detected and processed +- **48 invisible anchors** created +- **13 highlight spans** with CSS class +- **4 equations** with KaTeX `\htmlId{}` +- **40 images** converted to components + +## ✅ Project Status + +### 🎉 **Complete Features** +- ✅ **LaTeX → MDX Pipeline**: Full end-to-end functional conversion +- ✅ **Cross-document references**: Perfectly functional internal links +- ✅ **Interactive equations**: KaTeX support with clickable IDs +- ✅ **Automatic styling**: Highlights and Astro components +- ✅ **Robustness**: Automatic cleanup of all escaping +- ✅ **Optimization**: Clean code without unnecessary elements + +### 🚀 **Production Ready** +The toolkit is now **100% operational** for converting complex scientific LaTeX documents to MDX/Astro with all advanced features (references, interactive equations, styling). 
diff --git a/app/scripts/latex-importer/bib-cleaner.mjs b/app/scripts/latex-importer/bib-cleaner.mjs new file mode 100644 index 0000000000000000000000000000000000000000..4fb409a3838a1274770f41fc8b2a1457fa7de45d --- /dev/null +++ b/app/scripts/latex-importer/bib-cleaner.mjs @@ -0,0 +1,104 @@ +#!/usr/bin/env node + +import { readFileSync, writeFileSync, existsSync } from 'fs'; +import { join, dirname, basename } from 'path'; + +/** + * Clean a BibTeX file by removing local file references and paths + * @param {string} inputBibFile - Path to the input .bib file + * @param {string} outputBibFile - Path to the output cleaned .bib file + * @returns {boolean} - Success status + */ +export function cleanBibliography(inputBibFile, outputBibFile) { + if (!existsSync(inputBibFile)) { + console.log(' ⚠️ No bibliography file found:', inputBibFile); + return false; + } + + console.log('📚 Cleaning bibliography...'); + let bibContent = readFileSync(inputBibFile, 'utf8'); + + // Remove file paths and local references + bibContent = bibContent.replace(/file = \{[^}]+\}/g, ''); + + // Remove empty lines created by file removal + bibContent = bibContent.replace(/,\s*\n\s*\n/g, '\n\n'); + bibContent = bibContent.replace(/,\s*\}/g, '\n}'); + + // Clean up double commas + bibContent = bibContent.replace(/,,/g, ','); + + // Remove trailing commas before closing braces + bibContent = bibContent.replace(/,(\s*\n\s*)\}/g, '$1}'); + + writeFileSync(outputBibFile, bibContent); + console.log(` 📄 Clean bibliography saved: ${outputBibFile}`); + + return true; +} + +/** + * CLI for bibliography cleaning + */ +function main() { + const args = process.argv.slice(2); + + if (args.includes('--help') || args.includes('-h')) { + console.log(` +📚 BibTeX Bibliography Cleaner + +Usage: + node bib-cleaner.mjs [input.bib] [output.bib] + node bib-cleaner.mjs --input=input.bib --output=output.bib + +Options: + --input=FILE Input .bib file + --output=FILE Output cleaned .bib file + --help, -h Show this help + 
+Examples: + # Clean main.bib to clean.bib + node bib-cleaner.mjs main.bib clean.bib + + # Using flags + node bib-cleaner.mjs --input=references.bib --output=clean-refs.bib +`); + process.exit(0); + } + + let inputFile, outputFile; + + // Parse command line arguments + if (args.length >= 2 && !args[0].startsWith('--')) { + // Positional arguments + inputFile = args[0]; + outputFile = args[1]; + } else { + // Named arguments + for (const arg of args) { + if (arg.startsWith('--input=')) { + inputFile = arg.split('=')[1]; + } else if (arg.startsWith('--output=')) { + outputFile = arg.split('=')[1]; + } + } + } + + if (!inputFile || !outputFile) { + console.error('❌ Both input and output files are required'); + console.log('Use --help for usage information'); + process.exit(1); + } + + const success = cleanBibliography(inputFile, outputFile); + if (success) { + console.log('🎉 Bibliography cleaning completed!'); + } else { + process.exit(1); + } +} + +// Run CLI if called directly +if (import.meta.url === `file://${process.argv[1]}`) { + main(); +} diff --git a/app/scripts/latex-importer/filters/equation-ids.lua b/app/scripts/latex-importer/filters/equation-ids.lua new file mode 100644 index 0000000000000000000000000000000000000000..c07e21b001b4686324a974dae06c9f3093a540e9 --- /dev/null +++ b/app/scripts/latex-importer/filters/equation-ids.lua @@ -0,0 +1,134 @@ +--[[ +Pandoc Lua filter to add IDs to equations using KaTeX \htmlId syntax + +This filter processes display math equations and inline math that contain +\label{} commands, and wraps them with \htmlId{clean-id}{content} for KaTeX. 
+ +Requirements: +- KaTeX renderer with trust: true option +- Equations with \label{} commands in LaTeX +--]] + +-- Function to clean identifier strings (remove prefixes and colons) +function clean_identifier(id_str) + if id_str and type(id_str) == "string" then + -- Remove common prefixes and replace colons with dashes + local clean = id_str + :gsub("^equation:", ""):gsub("^eq:", "") -- Remove eq:/equation: prefix (Lua patterns have no "|" alternation) + :gsub(":", "-") -- Replace colons with dashes + :gsub("[^a-zA-Z0-9_-]", "-") -- Replace other problematic chars + :gsub("-+", "-") -- Collapse multiple dashes + :gsub("^-", "") -- Remove leading dash + :gsub("-$", "") -- Remove trailing dash + + -- Ensure we don't have empty identifiers + if clean == "" then + clean = id_str:gsub(":", "-") + end + + return clean + end + return id_str +end + +-- Process Math elements (both inline and display) +function Math(el) + local math_content = el.text + + -- Look for \label{...} commands in the math content + local label_match = math_content:match("\\label%{([^}]+)%}") + + if label_match then + -- Clean the identifier + local clean_id = clean_identifier(label_match) + + -- Remove the \label{} command from the math content + local clean_math = math_content:gsub("\\label%{[^}]+%}", "") + + -- Clean up any extra whitespace or line breaks that might remain + clean_math = clean_math:gsub("%s*$", ""):gsub("^%s*", "") + + -- Handle different equation environments appropriately + -- For align environments, preserve them as they work with KaTeX + local has_align = clean_math:match("\\begin%{align%}") + + if has_align then + -- For align environments, we keep the structure and add ID as an attribute + -- KaTeX supports align environments natively + clean_math = clean_math:gsub("\\begin%{align%}", "\\begin{align}") + clean_math = clean_math:gsub("\\end%{align%}", "\\end{align}") + else + -- Remove other equation environments that don't work well with \htmlId + clean_math = clean_math:gsub("\\begin%{equation%}", ""):gsub("\\end%{equation%}", "") 
+ clean_math = clean_math:gsub("\\begin%{equation%*%}", ""):gsub("\\end%{equation%*%}", "") + clean_math = clean_math:gsub("\\begin%{align%*%}", ""):gsub("\\end%{align%*%}", "") + end + + -- Clean up any remaining whitespace + clean_math = clean_math:gsub("%s*$", ""):gsub("^%s*", "") + + local new_math + if has_align then + -- For align environments, KaTeX doesn't support \htmlId with align + -- Instead, we add a special marker that the post-processor will convert to a span + -- This span will serve as an anchor for references + new_math = "%%ALIGN_ANCHOR_ID{" .. clean_id .. "}%%\n" .. clean_math + else + -- For other math, wrap with \htmlId{} + new_math = "\\htmlId{" .. clean_id .. "}{" .. clean_math .. "}" + end + + -- Return new Math element with the updated content + return pandoc.Math(el.mathtype, new_math) + end + + -- Return unchanged if no label found + return el +end + +-- Optional: Process RawInline elements that might contain LaTeX math +function RawInline(el) + if el.format == "latex" or el.format == "tex" then + local content = el.text + + -- Look for equation environments with labels + local label_match = content:match("\\label%{([^}]+)%}") + + if label_match then + local clean_id = clean_identifier(label_match) + + -- For raw LaTeX, we might need different handling + -- This is a simplified approach - adjust based on your needs + local clean_content = content:gsub("\\label%{[^}]+%}", "") + + if clean_content:match("\\begin%{equation") or clean_content:match("\\begin%{align") then + -- For equation environments, we might need to wrap differently + -- This depends on how your KaTeX setup handles equation environments + return pandoc.RawInline(el.format, clean_content) + end + end + end + + return el +end + +-- Optional: Process RawBlock elements for display equations +function RawBlock(el) + if el.format == "latex" or el.format == "tex" then + local content = el.text + + -- Look for equation environments with labels + local label_match = 
content:match("\\label%{([^}]+)%}") + + if label_match then + local clean_id = clean_identifier(label_match) + local clean_content = content:gsub("\\label%{[^}]+%}", "") + + -- For block equations, we might want to preserve the structure + -- but add the htmlId functionality + return pandoc.RawBlock(el.format, clean_content) + end + end + + return el +end diff --git a/app/scripts/latex-importer/index.mjs b/app/scripts/latex-importer/index.mjs new file mode 100644 index 0000000000000000000000000000000000000000..9cdb8e0ba583b8fe4ac1e8ad9f6a187be69884fb --- /dev/null +++ b/app/scripts/latex-importer/index.mjs @@ -0,0 +1,138 @@ +#!/usr/bin/env node + +import { join, dirname } from 'path'; +import { fileURLToPath } from 'url'; +import { copyFileSync } from 'fs'; +import { convertLatexToMarkdown } from './latex-converter.mjs'; +import { convertToMdx } from './mdx-converter.mjs'; +import { cleanBibliography } from './bib-cleaner.mjs'; + +const __filename = fileURLToPath(import.meta.url); +const __dirname = dirname(__filename); + +// Default configuration +const DEFAULT_INPUT = join(__dirname, 'input', 'main.tex'); +const DEFAULT_OUTPUT = join(__dirname, 'output'); +const ASTRO_CONTENT_PATH = join(__dirname, '..', '..', 'src', 'content', 'article.mdx'); + +function parseArgs() { + const args = process.argv.slice(2); + const config = { + input: DEFAULT_INPUT, + output: DEFAULT_OUTPUT, + clean: false, + bibOnly: false, + convertOnly: false, + mdx: false, + }; + + for (const arg of args) { + if (arg.startsWith('--input=')) { + config.input = arg.split('=')[1]; + } else if (arg.startsWith('--output=')) { + config.output = arg.split('=')[1]; + } else if (arg === '--clean') { + config.clean = true; + } else if (arg === '--bib-only') { + config.bibOnly = true; + } else if (arg === '--convert-only') { + config.convertOnly = true; + } + } + + return config; +} + +function showHelp() { + console.log(` +🚀 LaTeX to Markdown Toolkit + +Usage: + node index.mjs [options] + +Options: + 
--input=PATH Input LaTeX file (default: input/main.tex) + --output=PATH Output directory (default: output/) + --clean Clean output directory before processing + --bib-only Only clean bibliography file + --convert-only Only convert LaTeX to Markdown (skip MDX conversion) + --help, -h Show this help + +Examples: + # Full conversion with bibliography cleaning + node index.mjs --clean + + # Only clean bibliography + node index.mjs --bib-only --input=paper.tex --output=clean/ + + # Only convert LaTeX to Markdown (no MDX output) + node index.mjs --convert-only + + # Custom paths + node index.mjs --input=../paper/main.tex --output=../results/ --clean +`); +} + +function main() { + const args = process.argv.slice(2); + + if (args.includes('--help') || args.includes('-h')) { + showHelp(); + process.exit(0); + } + + const config = parseArgs(); + + console.log('🚀 LaTeX to Markdown Toolkit'); + console.log('=============================='); + + try { + if (config.bibOnly) { + // Only clean bibliography + console.log('📚 Bibliography cleaning mode'); + const bibInput = config.input.replace('.tex', '.bib'); + const bibOutput = join(config.output, 'main.bib'); + + if (!cleanBibliography(bibInput, bibOutput)) process.exit(1); // Do not report success when cleaning failed + console.log('🎉 Bibliography cleaning completed!'); + + } else if (config.convertOnly) { + // Only convert LaTeX + console.log('📄 Conversion only mode'); + convertLatexToMarkdown(config.input, config.output); + + } else { + // Full workflow + console.log('🔄 Full conversion workflow'); + convertLatexToMarkdown(config.input, config.output); + + // Convert the generated Markdown to MDX + const markdownFile = join(config.output, 'main.md'); + const mdxFile = join(config.output, 'main.mdx'); + + console.log('📝 Converting Markdown to MDX...'); + convertToMdx(markdownFile, mdxFile); + + // Copy MDX to Astro content directory + console.log('📋 Copying MDX to Astro content directory...'); + try { + copyFileSync(mdxFile, ASTRO_CONTENT_PATH); + console.log(` ✅ Copied to ${ASTRO_CONTENT_PATH}`); + } 
catch (error) { + console.warn(` ⚠️ Failed to copy MDX to Astro: ${error.message}`); + } + } + + } catch (error) { + console.error('❌ Error:', error.message); + process.exit(1); + } +} + +// Export functions for use as module +export { convertLatexToMarkdown, cleanBibliography }; + +// Run CLI if called directly +if (import.meta.url === `file://${process.argv[1]}`) { + main(); +} diff --git a/app/scripts/latex-importer/latex-converter.mjs b/app/scripts/latex-importer/latex-converter.mjs new file mode 100644 index 0000000000000000000000000000000000000000..7079e2e43b85e9947a771a33a9ef22adb329f35a --- /dev/null +++ b/app/scripts/latex-importer/latex-converter.mjs @@ -0,0 +1,330 @@ +#!/usr/bin/env node + +import { execSync } from 'child_process'; +import { readFileSync, writeFileSync, existsSync, mkdirSync } from 'fs'; +import { join, dirname, basename } from 'path'; +import { fileURLToPath } from 'url'; +import { cleanBibliography } from './bib-cleaner.mjs'; +import { postProcessMarkdown } from './post-processor.mjs'; +import { preprocessLatexReferences } from './reference-preprocessor.mjs'; + +const __filename = fileURLToPath(import.meta.url); +const __dirname = dirname(__filename); + +// Configuration +const DEFAULT_INPUT = join(__dirname, 'input', 'main.tex'); +const DEFAULT_OUTPUT = join(__dirname, 'output'); + +function parseArgs() { + const args = process.argv.slice(2); + const config = { + input: DEFAULT_INPUT, + output: DEFAULT_OUTPUT, + clean: false + }; + + for (const arg of args) { + if (arg.startsWith('--input=')) { + config.input = arg.split('=')[1]; + } else if (arg.startsWith('--output=')) { + config.output = arg.split('=')[1]; + } else if (arg === '--clean') { + config.clean = true; + } + } + + return config; +} + +function ensureDirectory(dir) { + if (!existsSync(dir)) { + mkdirSync(dir, { recursive: true }); + } +} + +function cleanDirectory(dir) { + if (existsSync(dir)) { + execSync(`rm -rf "${dir}"/*`, { stdio: 'inherit' }); + } +} + +function 
preprocessLatexFile(inputFile, outputDir) { + const inputDir = dirname(inputFile); + const tempFile = join(outputDir, 'temp_main.tex'); + + console.log('🔄 Preprocessing LaTeX file to resolve \\input commands...'); + + let content = readFileSync(inputFile, 'utf8'); + + // Remove problematic commands that break pandoc + console.log('🧹 Cleaning problematic LaTeX constructs...'); + + // Fix citation issues - but not in citation keys + content = content.replace(/\$p_0\$(?![A-Za-z])/g, 'p0'); + + // Convert complex math environments to simple delimiters + content = content.replace(/\$\$\\begin\{equation\*\}/g, '$$'); + content = content.replace(/\\end\{equation\*\}\$\$/g, '$$'); + content = content.replace(/\\begin\{equation\*\}/g, '$$'); + content = content.replace(/\\end\{equation\*\}/g, '$$'); + // Keep align environments intact for KaTeX support + // Protect align environments by temporarily replacing them before cleaning & operators + const alignBlocks = []; + content = content.replace(/\\begin\{align\}([\s\S]*?)\\end\{align\}/g, (match, alignContent) => { + alignBlocks.push(match); + return `__ALIGN_BLOCK_${alignBlocks.length - 1}__`; + }); + + // Now remove & operators from non-align content (outside align environments) + content = content.replace(/&=/g, '='); + content = content.replace(/&/g, ''); + + // Restore align blocks with their & operators intact + alignBlocks.forEach((block, index) => { + content = content.replace(`__ALIGN_BLOCK_${index}__`, block); + }); + + // Convert LaTeX citations to Pandoc format + content = content.replace(/\\cite[tp]?\{([^}]+)\}/g, (match, citations) => { + // Handle multiple citations separated by commas - all become simple @citations + return citations.split(',').map(cite => `@${cite.trim()}`).join(', '); + }); + + // Handle complex \textsc with nested math - extract and simplify (but not in command definitions) + content = content.replace(/\\textsc\{([^{}]*(?:\{[^{}]*\}[^{}]*)*)\}/g, (match, content_inside, offset) => { + // 
Skip if this is inside a \newcommand or similar definition + const before = content.substring(Math.max(0, offset - 50), offset); + if (before.includes('\\newcommand') || before.includes('\\renewcommand') || before.includes('\\def')) { + return match; // Keep original + } + + // Remove math delimiters inside textsc for simplification + const simplified = content_inside.replace(/\\\([^)]+\\\)/g, 'MATHEXPR'); + return `\\text{${simplified}}`; + }); + + // Remove complex custom commands that pandoc can't handle + content = content.replace(/\\input\{snippets\/[^}]+\}/g, '% Code snippet removed'); + + // Find all \input{} commands (but skip commented ones) + const inputRegex = /^([^%]*?)\\input\{([^}]+)\}/gm; + let match; + + while ((match = inputRegex.exec(content)) !== null) { + const beforeInput = match[1]; + const inputPath = match[2]; + + // Skip if the \input is commented (% appears before \input on the line) + if (beforeInput.includes('%')) { + continue; + } + let fullPath; + + // Skip only problematic files, let Pandoc handle macros + if (inputPath.includes('snippets/')) { + console.log(` Skipping: ${inputPath}`); + content = content.replace(`\\input{${inputPath}}`, `% Skipped: ${inputPath}`); + continue; + } + + // Handle paths with or without .tex extension + if (inputPath.endsWith('.tex')) { + fullPath = join(inputDir, inputPath); + } else { + fullPath = join(inputDir, inputPath + '.tex'); + } + + if (existsSync(fullPath)) { + console.log(` Including: ${inputPath}`); + let includedContent = readFileSync(fullPath, 'utf8'); + + // Clean included content too + includedContent = includedContent.replace(/\$p_0\$/g, 'p0'); + includedContent = includedContent.replace(/\\input\{snippets\/[^}]+\}/g, '% Code snippet removed'); + + // Handle complex \textsc in included content + includedContent = includedContent.replace(/\\textsc\{([^{}]*(?:\{[^{}]*\}[^{}]*)*)\}/g, (match, content_inside, offset) => { + // Skip if this is inside a \newcommand or similar definition + 
const before = includedContent.substring(Math.max(0, offset - 50), offset); + if (before.includes('\\newcommand') || before.includes('\\renewcommand') || before.includes('\\def')) { + return match; // Keep original + } + + const simplified = content_inside.replace(/\\\([^)]+\\\)/g, 'MATHEXPR'); + return `\\text{${simplified}}`; + }); + + // Apply same align-preserving logic to included content + const alignBlocksIncluded = []; + includedContent = includedContent.replace(/\\begin\{align\}([\s\S]*?)\\end\{align\}/g, (match, alignContent) => { + alignBlocksIncluded.push(match); + return `__ALIGN_BLOCK_${alignBlocksIncluded.length - 1}__`; + }); + + // Remove alignment operators from non-align content in included files + includedContent = includedContent.replace(/&=/g, '='); + includedContent = includedContent.replace(/&/g, ''); + + // Restore align blocks with their & operators intact + alignBlocksIncluded.forEach((block, index) => { + includedContent = includedContent.replace(`__ALIGN_BLOCK_${index}__`, block); + }); + + // Convert math environments in included content + includedContent = includedContent.replace(/\$\$\\begin\{equation\*\}/g, '$$'); + includedContent = includedContent.replace(/\\end\{equation\*\}\$\$/g, '$$'); + includedContent = includedContent.replace(/\\begin\{equation\*\}/g, '$$'); + includedContent = includedContent.replace(/\\end\{equation\*\}/g, '$$'); + + // Convert citations in included content + includedContent = includedContent.replace(/\\cite[tp]?\{([^}]+)\}/g, (match, citations) => { + return citations.split(',').map(cite => `@${cite.trim()}`).join(', '); + }); + + content = content.replace(`\\input{${inputPath}}`, includedContent); + } else { + console.log(` ⚠️ File not found: ${fullPath} (skipping)`); + content = content.replace(`\\input{${inputPath}}`, `% File not found: ${inputPath}`); + } + } + + // Apply reference preprocessing AFTER input inclusion to ensure all references are captured + console.log('🔧 Preprocessing LaTeX 
references for MDX compatibility...'); + const referenceResult = preprocessLatexReferences(content); + content = referenceResult.content; + + // Write the preprocessed file + writeFileSync(tempFile, content); + return tempFile; +} + +function processBibliography(inputFile, outputDir) { + const bibFile = join(dirname(inputFile), 'main.bib'); + const outputBibFile = join(outputDir, 'main.bib'); + + if (!existsSync(bibFile)) { + console.log(' ⚠️ No bibliography file found'); + return null; + } + + const success = cleanBibliography(bibFile, outputBibFile); + return success ? outputBibFile : null; +} + +export function convertLatexToMarkdown(inputFile, outputDir) { + console.log('🚀 Simple LaTeX to Markdown Converter'); + console.log(`📁 Input: ${inputFile}`); + console.log(`📁 Output: ${outputDir}`); + + // Check if input file exists + if (!existsSync(inputFile)) { + console.error(`❌ Input file not found: ${inputFile}`); + process.exit(1); + } + + // Ensure output directory exists + ensureDirectory(outputDir); + + try { + // Check if pandoc is available + execSync('pandoc --version', { stdio: 'pipe' }); + } catch (error) { + console.error('❌ Pandoc not found. Please install it: brew install pandoc'); + process.exit(1); + } + + // Clean and copy bibliography + const cleanBibFile = processBibliography(inputFile, outputDir); + + // Preprocess the LaTeX file to resolve \input commands + const preprocessedFile = preprocessLatexFile(inputFile, outputDir); + + const inputFileName = basename(inputFile, '.tex'); + const outputFile = join(outputDir, `${inputFileName}.md`); + + try { + console.log('📄 Converting with Pandoc...'); + + // Enhanced pandoc conversion - use tex_math_dollars for KaTeX compatibility + const bibOption = cleanBibFile ? 
`--bibliography="${cleanBibFile}"` : ''; + + // Use gfm+tex_math_dollars for simple $ delimiters compatible with KaTeX + const mediaDir = join(outputDir, 'assets', 'image'); + ensureDirectory(mediaDir); + const inputDir = dirname(inputFile); + const equationFilterPath = join(__dirname, 'filters', 'equation-ids.lua'); + const pandocCommand = `pandoc "${preprocessedFile}" -f latex+latex_macros -t gfm+tex_math_dollars+raw_html --shift-heading-level-by=1 --wrap=none ${bibOption} --extract-media="${mediaDir}" --resource-path="${inputDir}" --lua-filter="${equationFilterPath}" -o "${outputFile}"`; + + console.log(` Running: ${pandocCommand}`); + execSync(pandocCommand, { stdio: 'pipe' }); + + // Clean up temp file + execSync(`rm "${preprocessedFile}"`, { stdio: 'pipe' }); + + // Post-processing to fix KaTeX incompatible constructions + let markdownContent = readFileSync(outputFile, 'utf8'); + + // Use modular post-processor with code injection + markdownContent = postProcessMarkdown(markdownContent, inputDir); + + writeFileSync(outputFile, markdownContent); + + console.log(`✅ Conversion completed: ${outputFile}`); + + // Show file size + const stats = execSync(`wc -l "${outputFile}"`, { encoding: 'utf8' }); + const lines = stats.trim().split(' ')[0]; + console.log(`📊 Result: ${lines} lines written`); + + } catch (error) { + console.error('❌ Pandoc conversion failed:'); + console.error(error.message); + // Clean up temp file on error + try { + execSync(`rm "${preprocessedFile}"`, { stdio: 'pipe' }); + } catch { } + process.exit(1); + } +} + +function main() { + const config = parseArgs(); + + if (config.clean) { + console.log('🧹 Cleaning output directory...'); + cleanDirectory(config.output); + } + + convertLatexToMarkdown(config.input, config.output); + + console.log('🎉 Simple conversion completed!'); +} + +// Show help if requested +if (process.argv.includes('--help') || process.argv.includes('-h')) { + console.log(` +🚀 Simple LaTeX to Markdown Converter + +Usage: + node 
latex-converter.mjs [options] + +Options: + --input=PATH Input LaTeX file (default: input/main.tex) + --output=PATH Output directory (default: output/) + --clean Clean output directory before conversion + --help, -h Show this help + +Examples: + # Basic conversion + node latex-converter.mjs + + # Custom paths + node latex-converter.mjs --input=my-paper.tex --output=converted/ + + # Clean output first + node latex-converter.mjs --clean +`); + process.exit(0); +} + +if (import.meta.url === `file://${process.argv[1]}`) main(); // Run the CLI only when executed directly, not when imported by index.mjs diff --git a/app/scripts/latex-importer/mdx-converter.mjs b/app/scripts/latex-importer/mdx-converter.mjs new file mode 100644 index 0000000000000000000000000000000000000000..5a0deaf79026bc7e6cdf6af59c7c1b61cbea03fb --- /dev/null +++ b/app/scripts/latex-importer/mdx-converter.mjs @@ -0,0 +1,896 @@ +#!/usr/bin/env node + +import { readFileSync, writeFileSync, existsSync } from 'fs'; +import { join, dirname, basename, extname } from 'path'; +import { fileURLToPath } from 'url'; +import { extractAndGenerateFrontmatter } from './metadata-extractor.mjs'; + +const __filename = fileURLToPath(import.meta.url); +const __dirname = dirname(__filename); + +// Configuration +const DEFAULT_INPUT = join(__dirname, 'output', 'main.md'); +const DEFAULT_OUTPUT = join(__dirname, 'output', 'main.mdx'); + +function parseArgs() { + const args = process.argv.slice(2); + const config = { + input: DEFAULT_INPUT, + output: DEFAULT_OUTPUT, + }; + + for (const arg of args) { + if (arg.startsWith('--input=')) { + config.input = arg.substring('--input='.length); + } else if (arg.startsWith('--output=')) { + config.output = arg.substring('--output='.length); + } else if (arg === '--help' || arg === '-h') { + console.log(` +📝 Markdown to MDX Converter + +Usage: + node mdx-converter.mjs [options] + +Options: + --input=PATH Input Markdown file (default: ${DEFAULT_INPUT}) + --output=PATH Output MDX file (default: ${DEFAULT_OUTPUT}) + --help, 
-h Show this help + +Examples: + # Basic conversion + node mdx-converter.mjs + + # Custom paths + node mdx-converter.mjs --input=article.md --output=article.mdx + `); + process.exit(0); + } else if (!config.input) { + config.input = arg; + } else if (!config.output) { + config.output = arg; + } + } + return config; +} + +/** + * Modular MDX post-processing functions for Astro compatibility + * Each function handles a specific type of transformation + */ + +/** + * Track which Astro components are used during transformations + */ +const usedComponents = new Set(); + +/** + * Track individual image imports needed + */ +const imageImports = new Map(); // src -> varName + +/** + * Add required component imports to the frontmatter + * @param {string} content - MDX content + * @returns {string} - Content with component imports + */ +/** + * Generate a variable name from image path + * @param {string} src - Image source path + * @returns {string} - Valid variable name + */ +function generateImageVarName(src) { + // Extract filename without extension and make it a valid JS variable + const filename = src.split('/').pop().replace(/\.[^.]+$/, ''); + return filename.replace(/[^a-zA-Z0-9]/g, '_').replace(/^[0-9]/, 'img_$&'); +} + +function addComponentImports(content) { + console.log(' 📦 Adding component and image imports...'); + + let imports = []; + + // Add component imports + if (usedComponents.size > 0) { + const componentImports = Array.from(usedComponents) + .map(component => `import ${component} from '../components/${component}.astro';`); + imports.push(...componentImports); + console.log(` ✅ Importing components: ${Array.from(usedComponents).join(', ')}`); + } + + // Add image imports + if (imageImports.size > 0) { + const imageImportStatements = Array.from(imageImports.entries()) + .map(([src, varName]) => `import ${varName} from '${src}';`); + imports.push(...imageImportStatements); + console.log(` ✅ Importing ${imageImports.size} image(s)`); + } + + if 
(imports.length === 0) { + console.log(' ℹ️ No imports needed'); + return content; + } + + const importBlock = imports.join('\n'); + + // Insert imports after frontmatter + const frontmatterEnd = content.indexOf('---', 3) + 3; + if (frontmatterEnd > 2) { + return content.slice(0, frontmatterEnd) + '\n\n' + importBlock + '\n' + content.slice(frontmatterEnd); + } else { + // No frontmatter, add at beginning + return importBlock + '\n\n' + content; + } +} + + +/** + * Convert grouped figures (subfigures) to MultiFigure components + * @param {string} content - MDX content + * @returns {string} - Content with MultiFigure components for grouped figures + */ +function convertSubfiguresToMultiFigure(content) { + console.log(' 🖼️✨ Converting subfigures to MultiFigure components...'); + + let convertedCount = 0; + + // Pattern to match:
<figure> containing multiple <figure> elements with a global caption + // This matches the LaTeX subfigure pattern that gets converted by Pandoc + const subfigureGroupPattern = /<figure>\s*((?:<figure>[\s\S]*?<\/figure>\s*){2,})<figcaption>
        ([\s\S]*?)<\/figcaption>\s*<\/figure>/g; + + const convertedContent = content.replace(subfigureGroupPattern, (match, figuresMatch, globalCaption) => { + convertedCount++; + + // Extract individual figures within the group + // This pattern is more flexible to handle variations in HTML structure + const individualFigurePattern = /
        \s*]*\/>\s*

        <span id="([^"]*)"[^&]*><\/span><\/p>\s*

        ([\s\S]*?)<\/figcaption>\s*<\/figure>/g; + + const images = []; + let figureMatch; + + while ((figureMatch = individualFigurePattern.exec(figuresMatch)) !== null) { + const [, src, id, caption] = figureMatch; + + // Clean the source path (similar to existing transformImages function) + const cleanSrc = src.replace(/.*\/output\/assets\//, './assets/') + .replace(/\/Users\/[^\/]+\/[^\/]+\/[^\/]+\/[^\/]+\/[^\/]+\/app\/scripts\/latex-to-markdown\/output\/assets\//, './assets/'); + + // Clean caption text (remove HTML, normalize whitespace) + const cleanCaption = caption + .replace(/<[^>]*>/g, '') + .replace(/\n/g, ' ') + .replace(/\s+/g, ' ') + .replace(/'/g, "\\'") + .trim(); + + // Generate alt text from caption + const altText = cleanCaption.length > 100 + ? cleanCaption.substring(0, 100) + '...' + : cleanCaption; + + // Generate variable name for import + const varName = generateImageVarName(cleanSrc); + imageImports.set(cleanSrc, varName); + + images.push({ + src: varName, + alt: altText, + caption: cleanCaption, + id: id + }); + } + + // Clean global caption + const cleanGlobalCaption = globalCaption + .replace(/<[^>]*>/g, '') + .replace(/\n/g, ' ') + .replace(/\s+/g, ' ') + .replace(/'/g, "\\'") + .trim(); + + // Mark MultiFigure component as used + usedComponents.add('MultiFigure'); + + // Determine layout based on number of images + let layout = 'auto'; + if (images.length === 2) layout = '2-column'; + else if (images.length === 3) layout = '3-column'; + else if (images.length === 4) layout = '4-column'; + + // Generate MultiFigure component + const imagesJson = images.map(img => + ` {\n src: ${img.src},\n alt: "${img.alt}",\n caption: "${img.caption}",\n id: "${img.id}"\n }` + ).join(',\n'); + + return ``; + }); + + if (convertedCount > 0) { + console.log(` ✅ Converted ${convertedCount} subfigure group(s) to MultiFigure component(s)`); + } else { + console.log(' ℹ️ No subfigure groups found'); + } + + return convertedContent; +} + +/** + * Transform 
images to Figure components + * @param {string} content - MDX content + * @returns {string} - Content with Figure components + */ +/** + * Create Figure component with import + * @param {string} src - Clean image source + * @param {string} alt - Alt text + * @param {string} id - Element ID + * @param {string} caption - Figure caption + * @param {string} width - Optional width + * @returns {string} - Figure component markup + */ +function createFigureComponent(src, alt = '', id = '', caption = '', width = '') { + const varName = generateImageVarName(src); + imageImports.set(src, varName); + usedComponents.add('Figure'); + + const props = []; + props.push(`src={${varName}}`); + props.push('zoomable'); + props.push('downloadable'); + if (id) props.push(`id="${id}"`); + props.push('layout="fixed"'); + if (alt) props.push(`alt="${alt}"`); + if (caption) props.push(`caption={'${caption}'}`); + + return ``; +} + +function transformImages(content) { + console.log(' 🖼️ Transforming images to Figure components with imports...'); + + let hasImages = false; + + // Helper function to clean source paths + const cleanSrcPath = (src) => { + return src.replace(/.*\/output\/assets\//, './assets/') + .replace(/\/Users\/[^\/]+\/[^\/]+\/[^\/]+\/[^\/]+\/[^\/]+\/app\/scripts\/latex-to-markdown\/output\/assets\//, './assets/'); + }; + + // Helper to clean caption text + const cleanCaption = (caption) => { + return caption + .replace(/<[^>]*>/g, '') // Remove HTML tags + .replace(/\n/g, ' ') // Replace newlines with spaces + .replace(/\r/g, ' ') // Replace carriage returns with spaces + .replace(/\s+/g, ' ') // Replace multiple spaces with single space + .replace(/'/g, "\\'") // Escape quotes + .trim(); // Trim whitespace + }; + + // Helper to clean alt text + const cleanAltText = (alt, maxLength = 100) => { + const cleaned = alt + .replace(/<[^>]*>/g, '') // Remove HTML tags + .replace(/\n/g, ' ') // Replace newlines with spaces + .replace(/\r/g, ' ') // Replace carriage returns with 
spaces + .replace(/\s+/g, ' ') // Replace multiple spaces with single space + .trim(); // Trim whitespace + + return cleaned.length > maxLength + ? cleaned.substring(0, maxLength) + '...' + : cleaned; + }; + + // 1. Transform complex HTML figures with style attributes + content = content.replace( + /
        \s*\s*
        \s*(.*?)\s*<\/figcaption>\s*<\/figure>/gs, + (match, id, src, style, caption) => { + const cleanSrc = cleanSrcPath(src); + const cleanCap = cleanCaption(caption); + const altText = cleanAltText(cleanCap); + hasImages = true; + + return createFigureComponent(cleanSrc, altText, id, cleanCap); + } + ); + + // 2. Transform standalone img tags with style + content = content.replace( + //g, + (match, src, style, alt) => { + const cleanSrc = cleanSrcPath(src); + const cleanAlt = cleanAltText(alt || 'Figure'); + hasImages = true; + + return createFigureComponent(cleanSrc, cleanAlt); + } + ); + + // 3. Transform images within wrapfigure divs + content = content.replace( + /
        \s*r[\d.]+\s*]*\/>\s*<\/div>/gs, + (match, src) => { + const cleanSrc = cleanSrcPath(src); + hasImages = true; + + return createFigureComponent(cleanSrc, 'Figure'); + } + ); + + // 4. Transform simple HTML figure/img without style + content = content.replace( + /
        \s*\s*
        \s*(.*?)\s*<\/figcaption>\s*<\/figure>/gs, + (match, id, src, caption) => { + const cleanSrc = cleanSrcPath(src); + const cleanCap = cleanCaption(caption); + const altText = cleanAltText(cleanCap); + hasImages = true; + + return createFigureComponent(cleanSrc, altText, id, cleanCap); + } + ); + + // 5. Clean up figures with minipage divs + content = content.replace( + /
        \s*
        \s*]*\/>\s*<\/div>\s*]*>(.*?)<\/figcaption>\s*<\/figure>/gs, + (match, id, src, caption) => { + const cleanSrc = cleanSrcPath(src); + const cleanCap = cleanCaption(caption); + const altText = cleanAltText(cleanCap); + hasImages = true; + + return createFigureComponent(cleanSrc, altText, id, cleanCap); + } + ); + + // 6. Transform Pandoc-style images: ![alt](src){#id attr="value"} + content = content.replace( + /!\[([^\]]*)\]\(([^)]+)\)(?:\{([^}]+)\})?/g, + (match, alt, src, attributes) => { + const cleanSrc = cleanSrcPath(src); + const cleanAlt = cleanAltText(alt || 'Figure'); + hasImages = true; + + let id = ''; + if (attributes) { + const idMatch = attributes.match(/#([\w-]+)/); + if (idMatch) id = idMatch[1]; + } + + return createFigureComponent(cleanSrc, cleanAlt, id); + } + ); + + if (hasImages) { + console.log(' ✅ Figure components with imports will be created'); + } + + return content; +} + +/** + * Transform HTML spans with style attributes to appropriate components + * @param {string} content - MDX content + * @returns {string} - Content with transformed spans + */ +function transformStyledSpans(content) { + console.log(' 🎨 Transforming styled spans...'); + + // Transform HTML spans with style attributes + content = content.replace( + /(.*?)<\/span>/g, + (match, color, text) => { + // Map colors to semantic classes or components + const colorMap = { + 'hf2': 'text-hf-secondary', + 'hf1': 'text-hf-primary' + }; + + const className = colorMap[color] || `text-${color}`; + return `${text}`; + } + ); + + // Transform markdown spans with style attributes: [text]{style="color: color"} + content = content.replace( + /\[([^\]]+)\]\{style="color: ([^"]+)"\}/g, + (match, text, color) => { + // Map colors to semantic classes or components + const colorMap = { + 'hf2': 'text-hf-secondary', + 'hf1': 'text-hf-primary' + }; + + const className = colorMap[color] || `text-${color}`; + return `${text}`; + } + ); + + return content; +} + +/** + * Transform reference 
links to proper Astro internal links + * @param {string} content - MDX content + * @returns {string} - Content with transformed links + */ +function fixHtmlEscaping(content) { + console.log(' 🔧 Fixing HTML escaping in spans...'); + + let fixedCount = 0; + + // Pattern 1: \\ + content = content.replace(/\\\\<\/span\\>/g, (match, id, style) => { + fixedCount++; + // Fix common style issues like "position- absolute;" -> "position: absolute;" + const cleanStyle = style.replace('position- absolute;', 'position: absolute;'); + return ``; + }); + + // Pattern 2: \...\ + content = content.replace(/\\([^\\]+)\\<\/span\\>/g, (match, className, text) => { + fixedCount++; + // Remove numbering like (1), (2), (3) from highlight spans + let cleanText = text; + if (className === 'highlight') { + cleanText = text.replace(/^\(\d+\)\s*/, ''); + } + return `${cleanText}`; + }); + + // Pattern 3: HTML-encoded spans in paragraph tags + //

        <span id="..." style="..."></span>

        + content = content.replace(/

        <span id="([^"]*)" style="([^"]*)"><\/span><\/p>/g, (match, id, style) => { + fixedCount++; + // Fix common style issues like "position- absolute;" -> "position: absolute;" + const cleanStyle = style.replace('position- absolute;', 'position: absolute;'); + return ``; + }); + + // Pattern 4: HTML-encoded spans with class in paragraph tags + //

        <span class="...">...</span>

        + content = content.replace(/

        <span class="([^"]*)">([^&]*)<\/span><\/p>/g, (match, className, text) => { + fixedCount++; + // Remove numbering like (1), (2), (3) from highlight spans + let cleanText = text; + if (className === 'highlight') { + cleanText = text.replace(/^\(\d+\)\s*/, ''); + } + return `${cleanText}`; + }); + + if (fixedCount > 0) { + console.log(` ✅ Fixed ${fixedCount} escaped span(s)`); + } + + return content; +} + +function cleanHighlightNumbering(content) { + console.log(' 🔢 Removing numbering from highlight spans...'); + + let cleanedCount = 0; + // Clean numbering from non-escaped highlight spans too + content = content.replace(/(\(\d+\)\s*)([^<]+)<\/span>/g, (match, numbering, text) => { + cleanedCount++; + return `${text}`; + }); + + if (cleanedCount > 0) { + console.log(` ✅ Removed numbering from ${cleanedCount} highlight span(s)`); + } + + return content; +} + +function transformReferenceLinks(content) { + console.log(' 🔗 Transforming reference links...'); + + // Transform Pandoc reference links: [text](#ref){reference-type="ref" reference="ref"} + return content.replace( + /\[([^\]]+)\]\((#[^)]+)\)\{[^}]*reference[^}]*\}/g, + (match, text, href) => { + return `[${text}](${href})`; + } + ); +} + + +/** + * Fix frontmatter and ensure proper MDX format + * @param {string} content - MDX content + * @param {string} latexContent - Original LaTeX content for metadata extraction + * @returns {string} - Content with proper frontmatter + */ +function ensureFrontmatter(content, latexContent = '') { + console.log(' 📄 Ensuring proper frontmatter...'); + + if (!content.startsWith('---')) { + let frontmatter; + + if (latexContent) { + // Extract metadata from LaTeX using dedicated module + frontmatter = extractAndGenerateFrontmatter(latexContent); + console.log(' ✅ Generated frontmatter from LaTeX metadata'); + } else { + // Fallback frontmatter + const currentDate = new Date().toLocaleDateString('en-US', { + year: 'numeric', + month: 'short', + day: '2-digit' + }); + 
frontmatter = `--- +title: "Research Article" +published: "${currentDate}" +tableOfContentsAutoCollapse: true +--- + +`; + console.log(' ✅ Generated basic frontmatter'); + } + + return frontmatter + content; + } + + return content; +} + +/** + * Fix mixed math delimiters like $`...`$ or `...`$ + * @param {string} content - MDX content + * @returns {string} - Content with fixed math delimiters + */ +function fixMixedMathDelimiters(content) { + console.log(' 🔧 Fixing mixed math delimiters...'); + + let fixedCount = 0; + + // Fix patterns like $`...`$ (mixed delimiters) + content = content.replace(/\$`([^`]*)`\$/g, (match, mathContent) => { + fixedCount++; + return `$${mathContent}$`; + }); + + // Fix patterns like `...`$ (backtick start, dollar end) + content = content.replace(/`([^`]*)`\$/g, (match, mathContent) => { + fixedCount++; + return `$${mathContent}$`; + }); + + // Fix patterns like $`...` (dollar start, backtick end - less common) + content = content.replace(/\$`([^`]*)`(?!\$)/g, (match, mathContent) => { + fixedCount++; + return `$${mathContent}$`; + }); + + if (fixedCount > 0) { + console.log(` ✅ Fixed ${fixedCount} mixed math delimiter(s)`); + } + + return content; +} + +/** + * Clean up orphaned math delimiters and fix mixed content + * @param {string} content - MDX content + * @returns {string} - Content with cleaned math blocks + */ +function cleanOrphanedMathDelimiters(content) { + console.log(' 🧹 Cleaning orphaned math delimiters...'); + console.log(' 🔍 Content length:', content.length, 'chars'); + + let fixedCount = 0; + + // Fix orphaned $$ that are alone on lines (but not part of display math blocks) + // Only remove $$ that appear alone without corresponding closing $$ + content = content.replace(/^\$\$\s*$(?!\s*[\s\S]*?\$\$)/gm, () => { + fixedCount++; + return ''; + }); + + // Fix backticks inside $$....$$ blocks (Pandoc artifact) + const mathMatches = content.match(/\$\$([\s\S]*?)\$\$/g); + console.log(` 🔍 Found ${mathMatches ? 
mathMatches.length : 0} math blocks`); + + content = content.replace(/\$\$([\s\S]*?)\$\$/g, (match, mathContent) => { + // More aggressive: remove ALL single backticks in math blocks (they shouldn't be there) + let cleanedMath = mathContent; + + // Count backticks before + const backticksBefore = (mathContent.match(/`/g) || []).length; + + if (backticksBefore > 0) { + console.log(` 🔧 Found math block with ${backticksBefore} backtick(s)`); + } + + // Remove all isolated backticks (not in pairs) + cleanedMath = cleanedMath.replace(/`/g, ''); + + const backticksAfter = (cleanedMath.match(/`/g) || []).length; + + if (backticksBefore > 0) { + fixedCount++; + console.log(` 🔧 Removed ${backticksBefore} backtick(s) from math block`); + return `$$${cleanedMath}$$`; + } + return match; + }); + + // Fix escaped align in math blocks: \begin{align} -> \begin{align} + content = content.replace(/\\begin\{align\}/g, (match) => { + fixedCount++; + return '\\begin{align}'; + }); + + content = content.replace(/\\end\{align\}/g, (match) => { + fixedCount++; + return '\\end{align}'; + }); + + // Fix cases where text gets mixed with math blocks + // Pattern: ``` math ... 
``` text ``` math + content = content.replace(/``` math\s*\n([\s\S]*?)\n```\s*([^`\n]*?)\s*``` math/g, (match, math1, text, math2) => { + if (text.trim().length > 0 && !text.includes('```')) { + fixedCount++; + return '```' + ' math\n' + math1 + '\n```\n\n' + text.trim() + '\n\n```' + ' math'; + } + return match; + }); + + if (fixedCount > 0) { + console.log(` ✅ Fixed ${fixedCount} orphaned math delimiter(s)`); + } + + return content; +} + +/** + * Clean newlines from single-dollar math blocks ($...$) ONLY + * @param {string} content - MDX content + * @returns {string} - Content with cleaned math blocks + */ +function cleanSingleLineMathNewlines(content) { + console.log(' 🔢 Cleaning newlines in single-dollar math blocks ($...$)...'); + + let cleanedCount = 0; + + // ULTRA STRICT: Only target single dollar blocks ($...$) that contain newlines + // Use dotall flag (s) to match newlines with .*, and ensure we don't match $$ + const cleanedContent = content.replace(/\$(?!\$)([\s\S]*?)\$(?!\$)/g, (match, mathContent) => { + // Only process if the content contains newlines + if (mathContent.includes('\n')) { + cleanedCount++; + + // Remove ALL newlines and carriage returns, normalize whitespace + const cleanedMath = mathContent + .replace(/\n+/g, ' ') // Replace all newlines with spaces + .replace(/\r+/g, ' ') // Replace carriage returns with spaces + .replace(/\s+/g, ' ') // Normalize multiple spaces to single + .trim(); // Remove leading/trailing spaces + + return `$${cleanedMath}$`; + } + return match; // Keep original if no newlines + }); + + if (cleanedCount > 0) { + console.log(` ✅ Cleaned ${cleanedCount} single-dollar math block(s) with newlines`); + } + + return cleanedContent; +} + +/** + * Add proper line breaks around display math blocks ($$...$$) + * @param {string} content - MDX content + * @returns {string} - Content with properly spaced display math + */ +function formatDisplayMathBlocks(content) { + console.log(' 📐 Formatting display math blocks with 
proper spacing...'); + + let formattedCount = 0; + + // Find all $$...$$ blocks (display math) and ensure proper line breaks + // Very strict: only matches exactly $$ followed by content followed by $$ + const formattedContent = content.replace(/\$\$([\s\S]*?)\$\$/g, (match, mathContent) => { + formattedCount++; + + // Clean up the math content - trim whitespace but preserve structure + const cleanedMath = mathContent.trim(); + + // Return with proper line breaks before and after + return `\n$$\n${cleanedMath}\n$$\n`; + }); + + if (formattedCount > 0) { + console.log(` ✅ Formatted ${formattedCount} display math block(s) with proper spacing`); + } + + return formattedContent; +} + +/** + * Clean newlines from figcaption content + * @param {string} content - MDX content + * @returns {string} - Content with cleaned figcaptions + */ +function cleanFigcaptionNewlines(content) { + console.log(' 📝 Cleaning newlines in figcaption elements...'); + + let cleanedCount = 0; + + // Find all

        ...
        blocks and remove internal newlines + const cleanedContent = content.replace(/]*)>([\s\S]*?)<\/figcaption>/g, (match, attributes, captionContent) => { + // Only process if the content contains newlines + if (captionContent.includes('\n')) { + cleanedCount++; + + // Remove newlines and normalize whitespace + const cleanedCaption = captionContent + .replace(/\n+/g, ' ') // Replace newlines with spaces + .replace(/\s+/g, ' ') // Normalize multiple spaces + .trim(); // Trim whitespace + + return `${cleanedCaption}
        `; + } + + return match; // Return unchanged if no newlines + }); + + if (cleanedCount > 0) { + console.log(` ✅ Cleaned ${cleanedCount} figcaption element(s)`); + } else { + console.log(` ℹ️ No figcaption elements with newlines found`); + } + + return cleanedContent; +} + +/** + * Remove HTML comments from MDX content + * @param {string} content - MDX content + * @returns {string} - Content without HTML comments + */ +function removeHtmlComments(content) { + console.log(' 🗑️ Removing HTML comments...'); + + let removedCount = 0; + + // Remove all HTML comments + const cleanedContent = content.replace(//g, () => { + removedCount++; + return ''; + }); + + if (removedCount > 0) { + console.log(` ✅ Removed ${removedCount} HTML comment(s)`); + } + + return cleanedContent; +} + +/** + * Clean up MDX-incompatible syntax + * @param {string} content - MDX content + * @returns {string} - Cleaned content + */ +function cleanMdxSyntax(content) { + console.log(' 🧹 Cleaning MDX syntax...'); + + return content + // NOTE: Math delimiter fixing is now handled by fixMixedMathDelimiters() + // Ensure proper spacing around JSX-like constructs + .replace(/>\s*\n<') + // Remove problematic heading attributes - be more specific to avoid matching \begin{align} + .replace(/^(#{1,6}\s+[^{#\n]+)\{[^}]+\}$/gm, '$1') + // Fix escaped quotes in text + .replace(/\\("|')/g, '$1'); +} + +/** + * Main MDX processing function that applies all transformations + * @param {string} content - Raw Markdown content + * @param {string} latexContent - Original LaTeX content for metadata extraction + * @returns {string} - Processed MDX content compatible with Astro + */ +function processMdxContent(content, latexContent = '') { + console.log('🔧 Processing for Astro MDX compatibility...'); + + // Clear previous tracking + usedComponents.clear(); + imageImports.clear(); + + let processedContent = content; + + // Apply each transformation step sequentially + processedContent = 
ensureFrontmatter(processedContent, latexContent); + processedContent = fixMixedMathDelimiters(processedContent); + + // Debug: check for $$ blocks after fixMixedMathDelimiters + const mathBlocksAfterMixed = (processedContent.match(/\$\$([\s\S]*?)\$\$/g) || []).length; + console.log(` 📊 Math blocks after mixed delimiters fix: ${mathBlocksAfterMixed}`); + + processedContent = cleanOrphanedMathDelimiters(processedContent); + processedContent = cleanSingleLineMathNewlines(processedContent); + processedContent = formatDisplayMathBlocks(processedContent); + processedContent = removeHtmlComments(processedContent); + processedContent = cleanMdxSyntax(processedContent); + processedContent = convertSubfiguresToMultiFigure(processedContent); + processedContent = transformImages(processedContent); + processedContent = transformStyledSpans(processedContent); + processedContent = transformReferenceLinks(processedContent); + processedContent = fixHtmlEscaping(processedContent); + processedContent = cleanHighlightNumbering(processedContent); + processedContent = cleanFigcaptionNewlines(processedContent); + + // Add component imports at the end + processedContent = addComponentImports(processedContent); + + return processedContent; +} + +function convertToMdx(inputFile, outputFile) { + console.log('📝 Modular Markdown to Astro MDX Converter'); + console.log(`📁 Input: ${inputFile}`); + console.log(`📁 Output: ${outputFile}`); + + // Check if input file exists + if (!existsSync(inputFile)) { + console.error(`❌ Input file not found: ${inputFile}`); + process.exit(1); + } + + try { + console.log('🔄 Reading Markdown file...'); + const markdownContent = readFileSync(inputFile, 'utf8'); + + // Try to read original LaTeX file for metadata extraction + let latexContent = ''; + try { + const inputDir = dirname(inputFile); + const latexFile = join(inputDir, '..', 'input', 'main.tex'); + if (existsSync(latexFile)) { + latexContent = readFileSync(latexFile, 'utf8'); + } + } catch (error) { + // 
Ignore LaTeX reading errors - we'll use fallback frontmatter + } + + // Apply modular MDX processing + const mdxContent = processMdxContent(markdownContent, latexContent); + + console.log('💾 Writing MDX file...'); + writeFileSync(outputFile, mdxContent); + + console.log(`✅ Conversion completed: ${outputFile}`); + + // Show file size + const inputSize = Math.round(markdownContent.length / 1024); + const outputSize = Math.round(mdxContent.length / 1024); + console.log(`📊 Input: ${inputSize}KB → Output: ${outputSize}KB`); + + } catch (error) { + console.error('❌ Conversion failed:'); + console.error(error.message); + process.exit(1); + } +} + +export { convertToMdx }; + +function main() { + const config = parseArgs(); + convertToMdx(config.input, config.output); + console.log('🎉 MDX conversion completed!'); +} + +if (import.meta.url === `file://${process.argv[1]}`) { + main(); +} diff --git a/app/scripts/latex-importer/metadata-extractor.mjs b/app/scripts/latex-importer/metadata-extractor.mjs new file mode 100644 index 0000000000000000000000000000000000000000..14943e71fe86b5b6b60543c05176bb82e9ab617c --- /dev/null +++ b/app/scripts/latex-importer/metadata-extractor.mjs @@ -0,0 +1,170 @@ +/** + * LaTeX Metadata Extractor + * Extracts document metadata from LaTeX files for frontmatter generation + */ + +/** + * Extract metadata from LaTeX content + * @param {string} latexContent - Raw LaTeX content + * @returns {object} - Extracted metadata object + */ +export function extractLatexMetadata(latexContent) { + const metadata = {}; + + // Extract title + const titleMatch = latexContent.match(/\\title\s*\{\s*([^}]+)\s*\}/s); + if (titleMatch) { + metadata.title = titleMatch[1] + .replace(/\n/g, ' ') + .trim(); + } + + // Extract authors with their specific affiliations + const authors = []; + const authorMatches = latexContent.matchAll(/\\authorOne\[[^\]]*\]\{([^}]+)\}/g); + + for (const match of authorMatches) { + const fullAuthorInfo = match[1]; + + // Determine 
affiliations based on macros present + const affiliations = []; + if (fullAuthorInfo.includes('\\ensps')) { + affiliations.push(1); // École Normale Supérieure + } + if (fullAuthorInfo.includes('\\hf')) { + affiliations.push(2); // Hugging Face + } + + // Clean author name by removing macros + let authorName = fullAuthorInfo + .replace(/\\ensps/g, '') // Remove École macro + .replace(/\\hf/g, '') // Remove Hugging Face macro + .replace(/\s+/g, ' ') // Normalize whitespace + .trim(); + + // Skip empty authors or placeholder entries + if (authorName && authorName !== '...') { + authors.push({ + name: authorName, + affiliations: affiliations.length > 0 ? affiliations : [2] // Default to HF if no macro + }); + } + } + + if (authors.length > 0) { + metadata.authors = authors; + } + + // Extract affiliations - create the two distinct affiliations + metadata.affiliations = [ + { + name: "École Normale Supérieure Paris-Saclay" + }, + { + name: "Hugging Face" + } + ]; + + // Extract date if available (common LaTeX patterns) + const datePatterns = [ + /\\date\s*\{([^}]+)\}/, + /\\newcommand\s*\{\\date\}\s*\{([^}]+)\}/, + ]; + + for (const pattern of datePatterns) { + const dateMatch = latexContent.match(pattern); + if (dateMatch) { + metadata.published = dateMatch[1].trim(); + break; + } + } + + // Fallback to current date if no date found + if (!metadata.published) { + metadata.published = new Date().toLocaleDateString('en-US', { + year: 'numeric', + month: 'short', + day: '2-digit' + }); + } + + return metadata; +} + +/** + * Generate YAML frontmatter from metadata object + * @param {object} metadata - Metadata object + * @returns {string} - YAML frontmatter string + */ +export function generateFrontmatter(metadata) { + let frontmatter = '---\n'; + + // Title + if (metadata.title) { + frontmatter += `title: "${metadata.title}"\n`; + } + + // Authors + if (metadata.authors && metadata.authors.length > 0) { + frontmatter += 'authors:\n'; + metadata.authors.forEach(author => 
{ + frontmatter += ` - name: "${author.name}"\n`; + if (author.url) { + frontmatter += ` url: "${author.url}"\n`; + } + frontmatter += ` affiliations: [${author.affiliations.join(', ')}]\n`; + }); + } + + // Affiliations + if (metadata.affiliations && metadata.affiliations.length > 0) { + frontmatter += 'affiliations:\n'; + metadata.affiliations.forEach((affiliation, index) => { + frontmatter += ` - name: "${affiliation.name}"\n`; + if (affiliation.url) { + frontmatter += ` url: "${affiliation.url}"\n`; + } + }); + } + + // Publication date + if (metadata.published) { + frontmatter += `published: "${metadata.published}"\n`; + } + + // Additional metadata + if (metadata.doi) { + frontmatter += `doi: "${metadata.doi}"\n`; + } + + if (metadata.description) { + frontmatter += `description: "${metadata.description}"\n`; + } + + if (metadata.licence) { + frontmatter += `licence: >\n ${metadata.licence}\n`; + } + + if (metadata.tags && metadata.tags.length > 0) { + frontmatter += 'tags:\n'; + metadata.tags.forEach(tag => { + frontmatter += ` - ${tag}\n`; + }); + } + + // Default Astro configuration + frontmatter += 'tableOfContentsAutoCollapse: true\n'; + frontmatter += '---\n\n'; + + return frontmatter; +} + +/** + * Extract and generate frontmatter from LaTeX content + * @param {string} latexContent - Raw LaTeX content + * @returns {string} - Complete YAML frontmatter + */ +export function extractAndGenerateFrontmatter(latexContent) { + const metadata = extractLatexMetadata(latexContent); + return generateFrontmatter(metadata); +} diff --git a/app/scripts/latex-importer/package-lock.json b/app/scripts/latex-importer/package-lock.json new file mode 100644 index 0000000000000000000000000000000000000000..86ab5949a70a037c497af3022c6e68c7fe8c0e83 Binary files /dev/null and b/app/scripts/latex-importer/package-lock.json differ diff --git a/app/scripts/latex-importer/package.json b/app/scripts/latex-importer/package.json new file mode 100644 index 
0000000000000000000000000000000000000000..16850d0a30f47e351a3303970a7ce3a2a881abb5 Binary files /dev/null and b/app/scripts/latex-importer/package.json differ diff --git a/app/scripts/latex-importer/post-processor.mjs b/app/scripts/latex-importer/post-processor.mjs new file mode 100644 index 0000000000000000000000000000000000000000..c108c173957c93412672add2199f978ad73ab73f --- /dev/null +++ b/app/scripts/latex-importer/post-processor.mjs @@ -0,0 +1,439 @@ +#!/usr/bin/env node + +import { readFileSync, writeFileSync, existsSync, readdirSync } from 'fs'; +import { join, dirname } from 'path'; +import { fileURLToPath } from 'url'; + +const __filename = fileURLToPath(import.meta.url); +const __dirname = dirname(__filename); + +/** + * Post-processor for cleaning Markdown content from LaTeX conversion + * Each function handles a specific type of cleanup for maintainability + */ + +/** + * Remove TeX low-level grouping commands that break KaTeX + * @param {string} content - Markdown content + * @returns {string} - Cleaned content + */ +function removeTexGroupingCommands(content) { + console.log(' 🧹 Removing TeX grouping commands...'); + + return content + .replace(/\\mathopen\{\}\\mathclose\\bgroup/g, '') + .replace(/\\aftergroup\\egroup/g, '') + .replace(/\\bgroup/g, '') + .replace(/\\egroup/g, ''); +} + +/** + * Simplify LaTeX delimiter constructions + * @param {string} content - Markdown content + * @returns {string} - Cleaned content + */ +function simplifyLatexDelimiters(content) { + console.log(' 🔧 Simplifying LaTeX delimiters...'); + + return content + .replace(/\\left\[\s*/g, '[') + .replace(/\s*\\right\]/g, ']'); +} + +/** + * Remove orphaned LaTeX labels + * @param {string} content - Markdown content + * @returns {string} - Cleaned content + */ +function removeOrphanedLabels(content) { + console.log(' 🏷️ Removing orphaned labels...'); + + return content + .replace(/^\s*\\label\{[^}]+\}\s*$/gm, '') + .replace(/\\label\{[^}]+\}/g, ''); +} + +/** + * Fix 
KaTeX-incompatible math commands + * @param {string} content - Markdown content + * @returns {string} - Cleaned content + */ +function fixMathCommands(content) { + console.log(' 📐 Fixing KaTeX-incompatible math commands...'); + + return content + // Replace \hdots with \ldots (KaTeX compatible) + .replace(/\\hdots/g, '\\ldots') + // Add more math command fixes here as needed + .replace(/\\vdots/g, '\\vdots'); // This one should be fine, but kept for consistency +} + +/** + * Convert LaTeX matrix commands to KaTeX-compatible environments + * @param {string} content - Markdown content + * @returns {string} - Content with fixed matrix commands + */ +function fixMatrixCommands(content) { + console.log(' 🔢 Converting matrix commands to KaTeX format...'); + + let fixedCount = 0; + + // Convert \pmatrix{...} to \begin{pmatrix}...\end{pmatrix} + content = content.replace(/\\pmatrix\{([^{}]*(?:\{[^{}]*\}[^{}]*)*)\}/g, (match, matrixContent) => { + fixedCount++; + // Split by \\ for rows, handle nested braces + const rows = matrixContent.split('\\\\').map(row => row.trim()).filter(row => row); + return `\\begin{pmatrix}\n${rows.join(' \\\\\n')}\n\\end{pmatrix}`; + }); + + // Convert \bmatrix{...} to \begin{bmatrix}...\end{bmatrix} + content = content.replace(/\\bmatrix\{([^{}]*(?:\{[^{}]*\}[^{}]*)*)\}/g, (match, matrixContent) => { + fixedCount++; + const rows = matrixContent.split('\\\\').map(row => row.trim()).filter(row => row); + return `\\begin{bmatrix}\n${rows.join(' \\\\\n')}\n\\end{bmatrix}`; + }); + + // Convert \vmatrix{...} to \begin{vmatrix}...\end{vmatrix} + content = content.replace(/\\vmatrix\{([^{}]*(?:\{[^{}]*\}[^{}]*)*)\}/g, (match, matrixContent) => { + fixedCount++; + const rows = matrixContent.split('\\\\').map(row => row.trim()).filter(row => row); + return `\\begin{vmatrix}\n${rows.join(' \\\\\n')}\n\\end{vmatrix}`; + }); + + if (fixedCount > 0) { + console.log(` ✅ Fixed ${fixedCount} matrix command(s)`); + } + + return content; +} + +/** + * Fix 
Unicode characters that break MDX/JSX parsing + * @param {string} content - Markdown content + * @returns {string} - Cleaned content + */ +function fixUnicodeIssues(content) { + console.log(' 🌐 Fixing Unicode characters for MDX compatibility...'); + + return content + // Replace Unicode middle dot (·) with \cdot in math expressions + // (the trailing space keeps \cdot from fusing with the next token, e.g. "a·b" -> "a\cdot b") + .replace(/\$([^$]*?)·([^$]*?)\$/g, (match, before, after) => { + return `$${before}\\cdot ${after}$`; + }) + // Replace Unicode middle dot in display math + .replace(/\$\$([^$]*?)·([^$]*?)\$\$/g, (match, before, after) => { + return `$$${before}\\cdot ${after}$$`; + }) + // Replace other problematic Unicode characters + .replace(/[“”]/g, '"') // Smart quotes to regular quotes + .replace(/[‘’]/g, "'") // Smart apostrophes to regular apostrophes + .replace(/…/g, '...') // Ellipsis to three dots + .replace(/–/g, '-') // En dash to hyphen + .replace(/—/g, '--'); // Em dash to double hyphen +} + +/** + * Fix multiline math expressions for MDX compatibility + * @param {string} content - Markdown content + * @returns {string} - Cleaned content + */ +function fixMultilineMath(content) { + console.log(' 📏 Fixing multiline math expressions for MDX...'); + + return content + // Convert multiline inline math to display math blocks (more precise regex) + // Only match if the content is a self-contained math expression within a single line + .replace(/\$([^$\n]*\\\\[^$\n]*)\$/g, (match, mathContent) => { + // Only convert if it contains actual math operators and line breaks + if (mathContent.includes('\\\\') && /[=+\-*/^_{}]/.test(mathContent)) { + // Remove leading/trailing whitespace and normalize newlines + const cleanedMath = mathContent + .replace(/^\s+|\s+$/g, '') + .replace(/\s*\\\\\s*/g, '\\\\\n '); + return `$$\n${cleanedMath}\n$$`; + } + return match; // Keep original if it doesn't look like multiline math + }) + // Ensure display math blocks are properly separated + .replace(/\$\$\s*\n\s*([^$]+?)\s*\n\s*\$\$/g, (match, mathContent) => { + 
return `\n$$\n${mathContent.trim()}\n$$\n`; + }); +} + +/** + * Inject code snippets into empty code blocks + * @param {string} content - Markdown content + * @param {string} inputDir - Directory containing the LaTeX source and snippets + * @returns {string} - Content with injected code snippets + */ +function injectCodeSnippets(content, inputDir = null) { + console.log(' 💻 Injecting code snippets...'); + + if (!inputDir) { + console.log(' ⚠️ No input directory provided, skipping code injection'); + return content; + } + + const snippetsDir = join(inputDir, 'snippets'); + + if (!existsSync(snippetsDir)) { + console.log(' ⚠️ Snippets directory not found, skipping code injection'); + return content; + } + + // Get all available snippet files + let availableSnippets = []; + try { + availableSnippets = readdirSync(snippetsDir); + console.log(` 📁 Found ${availableSnippets.length} snippet file(s): ${availableSnippets.join(', ')}`); + } catch (error) { + console.log(` ❌ Error reading snippets directory: ${error.message}`); + return content; + } + + // Find all empty code blocks + const emptyCodeBlockPattern = /```\s*(\w+)\s*\n\s*```/g; + + let processedContent = content; + let injectionCount = 0; + + processedContent = processedContent.replace(emptyCodeBlockPattern, (match, language) => { + // Map language names to file extensions + const extensionMap = { + 'python': 'py', + 'javascript': 'js', + 'typescript': 'ts', + 'bash': 'sh', + 'shell': 'sh' + }; + + const fileExtension = extensionMap[language] || language; + + // Try to find a matching snippet file for this language + const matchingFiles = availableSnippets.filter(file => + file.endsWith(`.${fileExtension}`) + ); + + if (matchingFiles.length === 0) { + console.log(` ⚠️ No ${language} snippet found (looking for .${fileExtension})`); + return match; + } + + // Use the first matching file (could be made smarter with context analysis) + const selectedFile = matchingFiles[0]; + const snippetPath = join(snippetsDir, 
selectedFile); + + try { + const snippetContent = readFileSync(snippetPath, 'utf8'); + injectionCount++; + console.log(` ✅ Injected: ${selectedFile}`); + return `\`\`\`${language}\n${snippetContent.trim()}\n\`\`\``; + } catch (error) { + console.log(` ❌ Error reading ${selectedFile}: ${error.message}`); + return match; + } + }); + + if (injectionCount > 0) { + console.log(` 📊 Injected ${injectionCount} code snippet(s)`); + } + + return processedContent; +} + +/** + * Fix all attributes that still contain colons (href, data-reference, id) + * @param {string} content - Markdown content + * @returns {string} - Cleaned content + */ +function fixAllAttributes(content) { + console.log(' 🔗 Fixing all attributes with colons...'); + + let fixedCount = 0; + + // Fix href attributes containing colons + content = content.replace(/href="([^"]*):([^"]*)"/g, (match, before, after) => { + fixedCount++; + return `href="${before}-${after}"`; + }); + + // Fix data-reference attributes containing colons + content = content.replace(/data-reference="([^"]*):([^"]*)"/g, (match, before, after) => { + fixedCount++; + return `data-reference="${before}-${after}"`; + }); + + // Fix id attributes containing colons (like in Figure components) + content = content.replace(/id="([^"]*):([^"]*)"/g, (match, before, after) => { + fixedCount++; + return `id="${before}-${after}"`; + }); + + if (fixedCount > 0) { + console.log(` ✅ Fixed ${fixedCount} attribute(s) with colons`); + } + + return content; +} + +/** + * Fix link text content that still contains colons + * @param {string} content - Markdown content + * @returns {string} - Cleaned content + */ +function fixLinkTextContent(content) { + console.log(' 📝 Fixing link text content with colons...'); + + let fixedCount = 0; + + // Fix text content within links that contain references with colons + // Pattern: [text:content] + const cleanedContent = content.replace(/]*)>\[([^:]*):([^\]]*)\]<\/a>/g, (match, attributes, before, after) => { + 
fixedCount++; + return `[${before}-${after}]`; + }); + + if (fixedCount > 0) { + console.log(` ✅ Fixed ${fixedCount} link text(s) with colons`); + } + + return cleanedContent; +} + +/** + * Convert align anchor markers to proper HTML spans outside math blocks + * @param {string} content - Markdown content + * @returns {string} - Content with converted anchor spans + */ +function convertAlignAnchors(content) { + console.log(' 🏷️ Converting align anchor markers to HTML spans...'); + + let convertedCount = 0; + + // Find and replace align anchor markers with proper spans outside math blocks + content = content.replace(/``` math\n%%ALIGN_ANCHOR_ID\{([^}]+)\}%%\n([\s\S]*?)\n```/g, (match, anchorId, mathContent) => { + convertedCount++; + return `\n\n\`\`\` math\n${mathContent}\n\`\`\``; + }); + + if (convertedCount > 0) { + console.log(` ✅ Converted ${convertedCount} align anchor marker(s) to spans`); + } + + return content; +} + +/** + * Main post-processing function that applies all cleanup steps + * @param {string} content - Raw Markdown content from Pandoc + * @param {string} inputDir - Optional: Directory containing LaTeX source for code injection + * @returns {string} - Cleaned Markdown content + */ +export function postProcessMarkdown(content, inputDir = null) { + console.log('🔧 Post-processing for KaTeX compatibility...'); + + let processedContent = content; + + // Apply each cleanup step sequentially + processedContent = removeTexGroupingCommands(processedContent); + processedContent = simplifyLatexDelimiters(processedContent); + processedContent = removeOrphanedLabels(processedContent); + processedContent = convertAlignAnchors(processedContent); + processedContent = fixMathCommands(processedContent); + processedContent = fixMatrixCommands(processedContent); + processedContent = fixUnicodeIssues(processedContent); + processedContent = fixMultilineMath(processedContent); + processedContent = fixAllAttributes(processedContent); + processedContent = 
fixLinkTextContent(processedContent); + + // Inject code snippets if input directory is provided + if (inputDir) { + processedContent = injectCodeSnippets(processedContent, inputDir); + } + + return processedContent; +} + +/** + * CLI interface for standalone usage + */ +function parseArgs() { + const args = process.argv.slice(2); + const config = { + input: join(__dirname, 'output', 'main.md'), + output: null, // Will default to input if not specified + verbose: false, + }; + + for (const arg of args) { + if (arg.startsWith('--input=')) { + config.input = arg.substring('--input='.length); + } else if (arg.startsWith('--output=')) { + config.output = arg.substring('--output='.length); + } else if (arg === '--verbose') { + config.verbose = true; + } else if (arg === '--help' || arg === '-h') { + console.log(` +🔧 Markdown Post-Processor + +Usage: + node post-processor.mjs [options] + +Options: + --input=PATH Input Markdown file (default: output/main.md) + --output=PATH Output file (default: overwrites input) + --verbose Verbose output + --help, -h Show this help + +Examples: + # Process main.md in-place + node post-processor.mjs + + # Process with custom paths + node post-processor.mjs --input=raw.md --output=clean.md + `); + process.exit(0); + } + } + + // Default output to input if not specified + if (!config.output) { + config.output = config.input; + } + + return config; +} + +function main() { + const config = parseArgs(); + + console.log('🔧 Markdown Post-Processor'); + console.log(`📁 Input: ${config.input}`); + console.log(`📁 Output: ${config.output}`); + + try { + const content = readFileSync(config.input, 'utf8'); + const processedContent = postProcessMarkdown(content); + + writeFileSync(config.output, processedContent); + + console.log(`✅ Post-processing completed: ${config.output}`); + + // Show stats if verbose + if (config.verbose) { + const originalLines = content.split('\n').length; + const processedLines = processedContent.split('\n').length; + 
console.log(`📊 Lines: ${originalLines} → ${processedLines}`); + } + + } catch (error) { + console.error('❌ Post-processing failed:'); + console.error(error.message); + process.exit(1); + } +} + +// Run CLI if called directly +if (import.meta.url === `file://${process.argv[1]}`) { + main(); +} diff --git a/app/scripts/latex-importer/reference-preprocessor.mjs b/app/scripts/latex-importer/reference-preprocessor.mjs new file mode 100644 index 0000000000000000000000000000000000000000..a3ae6ec933af1a90778a536a95d6675b7cfb5965 --- /dev/null +++ b/app/scripts/latex-importer/reference-preprocessor.mjs @@ -0,0 +1,239 @@ +#!/usr/bin/env node + +/** + * LaTeX Reference Preprocessor + * + * This module cleans up LaTeX references BEFORE Pandoc conversion to ensure + * consistent, MDX-compatible identifiers throughout the document. + * + * What it does: + * - Strips type prefixes from label identifiers: sec:intro → intro (any remaining colons become dashes) + * - Updates corresponding refs: \ref{sec:intro} → \ref{intro} + * - Handles all reference types: sec:, fig:, eq:, table:, etc. 
+ * - Maintains consistency between labels and references + */ + +/** + * Extract all references from LaTeX content + * @param {string} content - LaTeX content + * @returns {Object} - Object with labels and refs arrays + */ +function extractReferences(content) { + const references = { + labels: new Set(), + refs: new Set(), + cites: new Set() + }; + + // Find all \label{...} commands + const labelMatches = content.matchAll(/\\label\{([^}]+)\}/g); + for (const match of labelMatches) { + references.labels.add(match[1]); + } + + // Find all \ref{...} commands + const refMatches = content.matchAll(/\\ref\{([^}]+)\}/g); + for (const match of refMatches) { + references.refs.add(match[1]); + } + + // Find all \cite{...} commands (already handled in existing code but included for completeness) + const citeMatches = content.matchAll(/\\cite[tp]?\{([^}]+)\}/g); + for (const match of citeMatches) { + // Handle multiple citations: \cite{ref1,ref2,ref3} + const citations = match[1].split(',').map(cite => cite.trim()); + citations.forEach(cite => references.cites.add(cite)); + } + + return references; +} + +/** + * Create clean identifier mapping + * @param {Object} references - References object from extractReferences + * @returns {Map} - Mapping from original to clean identifiers + */ +function createCleanMapping(references) { + const mapping = new Map(); + + // Create mapping for all unique identifiers + const allIdentifiers = new Set([ + ...references.labels, + ...references.refs + ]); + + for (const id of allIdentifiers) { + // Remove common prefixes and replace colons with dashes + let cleanId = id + .replace(/^(sec|section|ch|chapter|fig|figure|eq|equation|tab|table|lst|listing|app|appendix):/gi, '') + .replace(/:/g, '-') + .replace(/[^a-zA-Z0-9_-]/g, '-') // Replace any other problematic characters + .replace(/-+/g, '-') // Collapse multiple dashes + .replace(/^-|-$/g, ''); // Remove leading/trailing dashes + + // Ensure we don't have empty identifiers + if (!cleanId) { 
+ cleanId = id.replace(/:/g, '-'); + } + + mapping.set(id, cleanId); + } + + return mapping; +} + +/** + * Convert labels to HTML anchor spans for better MDX compatibility + * @param {string} content - LaTeX content + * @param {Map} mapping - Identifier mapping (original -> clean) + * @returns {Object} - Result with content and count of conversions + */ +function convertLabelsToAnchors(content, mapping) { + let processedContent = content; + let anchorsCreated = 0; + + // Replace \label{...} with HTML anchor spans, but SKIP labels inside math environments + for (const [original, clean] of mapping) { + // Skip equation labels (they will be handled by the Lua filter) + if (original.startsWith('eq:')) { + continue; + } + + const labelRegex = new RegExp(`\\\\label\\{${escapeRegex(original)}\\}`, 'g'); + const labelMatches = processedContent.match(labelRegex); + + if (labelMatches) { + // Replace \label{original} with HTML span anchor (invisible but accessible) + processedContent = processedContent.replace(labelRegex, `\n\n\n\n`); + anchorsCreated += labelMatches.length; + } + } + + return { content: processedContent, anchorsCreated }; +} + +/** + * Convert \highlight{...} commands to HTML spans with CSS class + * @param {string} content - LaTeX content + * @returns {Object} - Result with content and count of conversions + */ +function convertHighlightCommands(content) { + let processedContent = content; + let highlightsConverted = 0; + + // Replace \highlight{...} with ... 
+ processedContent = processedContent.replace(/\\highlight\{([^}]+)\}/g, (match, text) => { + highlightsConverted++; + return `${text}`; + }); + + return { content: processedContent, highlightsConverted }; +} + +/** + * Apply mapping to LaTeX content + * @param {string} content - Original LaTeX content + * @param {Map} mapping - Identifier mapping + * @returns {string} - Cleaned LaTeX content + */ +function applyMapping(content, mapping) { + let cleanedContent = content; + let changesCount = 0; + + // First, convert labels to anchor spans + const anchorResult = convertLabelsToAnchors(cleanedContent, mapping); + cleanedContent = anchorResult.content; + const anchorsCreated = anchorResult.anchorsCreated; + + // Convert \highlight{} commands to spans + const highlightResult = convertHighlightCommands(cleanedContent); + cleanedContent = highlightResult.content; + const highlightsConverted = highlightResult.highlightsConverted; + + // Then apply mapping to remaining references and equation labels + for (const [original, clean] of mapping) { + if (original !== clean) { + // Replace \ref{original} with \ref{clean} + const refRegex = new RegExp(`\\\\ref\\{${escapeRegex(original)}\\}`, 'g'); + const refMatches = cleanedContent.match(refRegex); + if (refMatches) { + cleanedContent = cleanedContent.replace(refRegex, `\\ref{${clean}}`); + changesCount += refMatches.length; + } + + // For equation labels, still clean the labels themselves (for the Lua filter) + if (original.startsWith('eq:')) { + const labelRegex = new RegExp(`\\\\label\\{${escapeRegex(original)}\\}`, 'g'); + const labelMatches = cleanedContent.match(labelRegex); + if (labelMatches) { + cleanedContent = cleanedContent.replace(labelRegex, `\\label{${clean}}`); + changesCount += labelMatches.length; + } + } + } + } + + return { + content: cleanedContent, + changesCount: changesCount + anchorsCreated, + highlightsConverted: highlightsConverted + }; +} + +/** + * Escape special regex characters + * @param {string} 
string - String to escape + * @returns {string} - Escaped string + */ +function escapeRegex(string) { + return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); +} + +/** + * Main preprocessing function + * @param {string} latexContent - Original LaTeX content + * @returns {Object} - Result with cleaned content and statistics + */ +export function preprocessLatexReferences(latexContent) { + console.log('🔧 Preprocessing LaTeX references for MDX compatibility...'); + + // 1. Extract all references + const references = extractReferences(latexContent); + + console.log(` 📊 Found: ${references.labels.size} labels, ${references.refs.size} refs`); + + // 2. Create clean mapping + const mapping = createCleanMapping(references); + + // 3. Apply mapping + const result = applyMapping(latexContent, mapping); + + if (result.changesCount > 0) { + console.log(` ✅ Processed ${result.changesCount} reference(s) and created anchor spans`); + + // Show some examples of changes + let exampleCount = 0; + for (const [original, clean] of mapping) { + if (original !== clean && exampleCount < 3) { + console.log(` ${original} → ${clean} (span + refs)`); + exampleCount++; + } + } + if (mapping.size > 3) { + console.log(` ... 
and ${mapping.size - 3} more anchor spans created`); + } + } else { + console.log(' ℹ️ No reference cleanup needed'); + } + + if (result.highlightsConverted > 0) { + console.log(` ✨ Converted ${result.highlightsConverted} \\highlight{} command(s) to `); + } + + return { + content: result.content, + changesCount: result.changesCount, + mapping: mapping, + references: references + }; +} diff --git a/app/scripts/notion-importer/.cursorignore b/app/scripts/notion-importer/.cursorignore new file mode 100644 index 0000000000000000000000000000000000000000..2eea525d885d5148108f6f3a9a8613863f783d36 --- /dev/null +++ b/app/scripts/notion-importer/.cursorignore @@ -0,0 +1 @@ +.env \ No newline at end of file diff --git a/app/scripts/notion-importer/README.md b/app/scripts/notion-importer/README.md new file mode 100644 index 0000000000000000000000000000000000000000..998806457a853a6dc93a8bd393df921c0aea5eb4 --- /dev/null +++ b/app/scripts/notion-importer/README.md @@ -0,0 +1,334 @@ +# Notion Importer + +Complete Notion to MDX (Markdown + JSX) importer optimized for Astro with advanced media handling, interactive components, and seamless integration. 
+ +## 🚀 Quick Start + +### Method 1: Using NOTION_PAGE_ID (Recommended) + +```bash +# Install dependencies +npm install + +# Setup environment variables +cp env.example .env +# Edit .env with your Notion token and page ID + +# Complete Notion → MDX conversion (fetches title/slug automatically) +NOTION_TOKEN=secret_xxx NOTION_PAGE_ID=abc123 node index.mjs + +# Or use .env file +node index.mjs +``` + +### Method 2: Using pages.json (Legacy) + +```bash +# Install dependencies +npm install + +# Setup environment variables +cp env.example .env +# Edit .env with your Notion token + +# Configure pages in input/pages.json +# { +# "pages": [ +# { +# "id": "your-page-id", +# "title": "Title", +# "slug": "slug" +# } +# ] +# } + +# Complete Notion → MDX conversion +node index.mjs + +# For step-by-step debugging +node notion-converter.mjs # Notion → Markdown +node mdx-converter.mjs # Markdown → MDX +``` + +## 📁 Structure + +``` +notion-importer/ +├── index.mjs # Complete Notion → MDX pipeline +├── notion-converter.mjs # Notion → Markdown with notion-to-md v4 +├── mdx-converter.mjs # Markdown → MDX with Astro components +├── post-processor.mjs # Markdown post-processing +├── package.json # Dependencies and scripts +├── env.example # Environment variables template +├── static/ # Static files injected at build time +│ ├── frontmatter.mdx # Static frontmatter (overrides all others) +│ └── bibliography.bib # Static bibliography +├── input/ # Configuration +│ └── pages.json # Notion pages to convert +└── output/ # Results + ├── *.md # Intermediate Markdown + ├── *.mdx # Final MDX for Astro + └── media/ # Downloaded media files +``` + +## ✨ Key Features + +### 🎯 **Advanced Media Handling** +- **Local download**: Automatic download of all Notion media (images, files, PDFs) +- **Path transformation**: Smart path conversion for web accessibility +- **Image components**: Automatic conversion to Astro `Image` components with zoom/download +- **Media organization**: Structured media storage 
by page ID + +### 🧮 **Interactive Components** +- **Callouts → Notes**: Notion callouts converted to Astro `Note` components +- **Enhanced tables**: Tables wrapped in styled containers +- **Code blocks**: Enhanced with copy functionality +- **Automatic imports**: Smart component and image import generation + +### 🎨 **Smart Formatting** +- **Link fixing**: Notion internal links converted to relative links +- **Artifact cleanup**: Removal of Notion-specific formatting artifacts +- **Static frontmatter**: Priority injection of custom frontmatter from `static/frontmatter.mdx` +- **Static bibliography**: Automatic copying of `static/bibliography.bib` +- **Astro compatibility**: Full compatibility with Astro MDX processing + +### 🔧 **Robust Pipeline** +- **Notion preprocessing**: Advanced page configuration and media strategy +- **Post-processing**: Markdown cleanup and optimization +- **MDX conversion**: Final transformation with Astro components +- **Auto-copy**: Automatic copying to Astro content directory + +## 📄 Static Files Configuration + +The importer supports static files for consistent metadata and bibliography: + +### Frontmatter (`static/frontmatter.mdx`) +Create this file to override frontmatter across all conversions: + +```yaml +--- +title: "My Article Title" +subtitle: "Optional subtitle" +description: "Article description for SEO" +authors: + - name: "Jane Doe" + url: "https://example.com" + affiliations: + - "Hugging Face" +tags: + - AI + - Research +doi: "10.1000/182" +tableOfContentsAutoCollapse: true +--- +``` + +This static frontmatter takes **highest priority** over any Notion metadata or existing frontmatter. + +### Bibliography (`static/bibliography.bib`) +Add your BibTeX entries to be copied to `src/content/bibliography.bib`: + +```bibtex +@article{example2024, + title={Example Article}, + author={Doe, Jane and Smith, John}, + journal={Example Journal}, + year={2024} +} +``` + +## 📊 Example Workflow + +```bash +# 1. 
Configure your Notion pages +# Edit input/pages.json with your page IDs + +# 2. Complete automatic conversion +NOTION_TOKEN=your_token node index.mjs --clean + +# 3. Generated results +ls output/ +# → getting-started.md (Intermediate Markdown) +# → getting-started.mdx (Final MDX for Astro) +# → media/ (downloaded images and files) +``` + +### 📋 Conversion Result + +The pipeline generates MDX files optimized for Astro with: + +```mdx +--- +title: "Getting Started with Notion" +published: "2024-01-15" +tableOfContentsAutoCollapse: true +--- + +import Image from '../components/Image.astro'; +import Note from '../components/Note.astro'; +import gettingStartedImage from './media/getting-started/image1.png'; + +## Introduction + +Here is some content with a callout: + +<Note> +This is a converted Notion callout. +</Note> + +And an image: + +<Image src={gettingStartedImage} />
+``` + +## ⚙️ Required Astro Configuration + +To use the generated MDX files, ensure your Astro project has the required components: + +```astro +// src/components/Figure.astro +--- +export interface Props { + src: any; + alt?: string; + caption?: string; + zoomable?: boolean; + downloadable?: boolean; + layout?: string; + id?: string; +} + +const { src, alt, caption, zoomable, downloadable, layout, id } = Astro.props; +--- + +<figure> + <img src={src} alt={alt} /> + {caption && <figcaption>{caption}</figcaption>} +</figure>
        +``` + +## 🛠️ Prerequisites + +- **Node.js** with ESM support +- **Notion Integration**: Set up an integration in your Notion workspace +- **Notion Token**: Copy the "Internal Integration Token" +- **Shared Pages**: Share the specific Notion page(s) with your integration +- **Astro** to use the generated MDX + +## 🎯 Technical Architecture + +### 4-Stage Pipeline + +1. **Notion Preprocessing** (`notion-converter.mjs`) + - Configuration loading from `pages.json` + - Notion API client initialization + - Media download strategy configuration + +2. **Notion-to-Markdown** (notion-to-md v4) + - Page conversion with `NotionConverter` + - Media downloading with `downloadMediaTo()` + - File export with `DefaultExporter` + +3. **Markdown Post-processing** (`post-processor.mjs`) + - Notion artifact cleanup + - Link fixing and optimization + - Table and code block enhancement + +4. **MDX Conversion** (`mdx-converter.mjs`) + - Component transformation (Figure, Note) + - Automatic import generation + - Frontmatter enhancement + - Astro compatibility optimization + +## 📊 Configuration Options + +### Pages Configuration (`input/pages.json`) + +```json +{ + "pages": [ + { + "id": "your-notion-page-id", + "title": "Page Title", + "slug": "page-slug" + } + ] +} +``` + +### Environment Variables + +Copy `env.example` to `.env` and configure: + +```bash +cp env.example .env +# Edit .env with your actual Notion token +``` + +Required variables: +```bash +NOTION_TOKEN=secret_your_notion_integration_token_here +``` + +### Command Line Options + +```bash +# Full workflow +node index.mjs --clean --token=your_token + +# Notion to Markdown only +node index.mjs --notion-only + +# Markdown to MDX only +node index.mjs --mdx-only + +# Custom paths +node index.mjs --input=my-pages.json --output=converted/ +``` + +## 📊 Conversion Statistics + +For a typical Notion page: +- **Media files** automatically downloaded and organized +- **Callouts** converted to interactive Note components +- 
**Images** transformed to Figure components with zoom/download +- **Tables** enhanced with proper styling containers +- **Code blocks** enhanced with copy functionality +- **Links** fixed for proper internal navigation + +## ✅ Project Status + +### 🎉 **Complete Features** +- ✅ **Notion → MDX Pipeline**: Full end-to-end functional conversion +- ✅ **Media Management**: Automatic download and path transformation +- ✅ **Component Integration**: Seamless Astro component integration +- ✅ **Smart Formatting**: Intelligent cleanup and optimization +- ✅ **Robustness**: Error handling and graceful degradation +- ✅ **Flexibility**: Modular pipeline with step-by-step options + +### 🚀 **Production Ready** +The toolkit is now **100% operational** for converting Notion pages to MDX/Astro with all advanced features (media handling, component integration, smart formatting). + +## 🔗 Integration with notion-to-md v4 + +This toolkit leverages the powerful [notion-to-md v4](https://notionconvert.com/docs/v4/guides/) library with: + +- **Advanced Media Strategies**: Download, upload, and direct media handling +- **Custom Renderers**: Block transformers and annotation transformers +- **Exporter Plugins**: File, buffer, and stdout output options +- **Database Support**: Full database property and frontmatter transformation +- **Page References**: Smart internal link handling + +## 📚 Additional Resources + +- [notion-to-md v4 Documentation](https://notionconvert.com/docs/v4/guides/) +- [Notion API Documentation](https://developers.notion.com/) +- [Astro MDX Documentation](https://docs.astro.build/en/guides/integrations-guide/mdx/) +- [Media Handling Strategies](https://notionconvert.com/blog/mastering-media-handling-in-notion-to-md-v4-download-upload-and-direct-strategies/) +- [Frontmatter Transformation](https://notionconvert.com/blog/how-to-convert-notion-properties-to-frontmatter-with-notion-to-md-v4/) diff --git a/app/scripts/notion-importer/env.example 
b/app/scripts/notion-importer/env.example new file mode 100644 index 0000000000000000000000000000000000000000..7b89b420f3d18d11035486c98019d406ab813599 --- /dev/null +++ b/app/scripts/notion-importer/env.example @@ -0,0 +1,2 @@ +NOTION_TOKEN=ntn_xxx +NOTION_PAGE_ID=xxx diff --git a/app/scripts/notion-importer/index.mjs b/app/scripts/notion-importer/index.mjs new file mode 100644 index 0000000000000000000000000000000000000000..a09e81236f88cc9408453e4b394479cc9bd70769 --- /dev/null +++ b/app/scripts/notion-importer/index.mjs @@ -0,0 +1,494 @@ +#!/usr/bin/env node + +import { config } from 'dotenv'; +import { join, dirname, basename } from 'path'; +import { fileURLToPath } from 'url'; +import { copyFileSync, existsSync, mkdirSync, readFileSync, writeFileSync, readdirSync, statSync, unlinkSync } from 'fs'; +import { convertNotionToMarkdown } from './notion-converter.mjs'; +import { convertToMdx } from './mdx-converter.mjs'; +import { Client } from '@notionhq/client'; + +// Load environment variables from .env file (but don't override existing ones) +config({ override: false }); + +const __filename = fileURLToPath(import.meta.url); +const __dirname = dirname(__filename); + +// Default configuration +const DEFAULT_INPUT = join(__dirname, 'input', 'pages.json'); +const DEFAULT_OUTPUT = join(__dirname, 'output'); +const ASTRO_CONTENT_PATH = join(__dirname, '..', '..', 'src', 'content', 'article.mdx'); +const ASTRO_ASSETS_PATH = join(__dirname, '..', '..', 'src', 'content', 'assets', 'image'); +const ASTRO_BIB_PATH = join(__dirname, '..', '..', 'src', 'content', 'bibliography.bib'); +const STATIC_BIB_PATH = join(__dirname, 'static', 'bibliography.bib'); + +function parseArgs() { + const args = process.argv.slice(2); + const config = { + input: DEFAULT_INPUT, + output: DEFAULT_OUTPUT, + clean: false, + notionOnly: false, + mdxOnly: false, + token: process.env.NOTION_TOKEN, + pageId: process.env.NOTION_PAGE_ID + }; + + for (const arg of args) { + if 
(arg.startsWith('--input=')) { + config.input = arg.slice('--input='.length); + } else if (arg.startsWith('--output=')) { + config.output = arg.slice('--output='.length); + } else if (arg.startsWith('--token=')) { + config.token = arg.slice('--token='.length); + } else if (arg.startsWith('--page-id=')) { + config.pageId = arg.slice('--page-id='.length); + } else if (arg === '--clean') { + config.clean = true; + } else if (arg === '--notion-only') { + config.notionOnly = true; + } else if (arg === '--mdx-only') { + config.mdxOnly = true; + } + } + + return config; +} + +function showHelp() { + console.log(` +🚀 Notion to MDX Toolkit + +Usage: + node index.mjs [options] + +Options: + --input=PATH Input pages configuration file (default: input/pages.json) + --output=PATH Output directory (default: output/) + --token=TOKEN Notion API token (or set NOTION_TOKEN env var) + --clean Clean output directory before processing + --notion-only Only convert Notion to Markdown (skip MDX conversion) + --mdx-only Only convert existing Markdown to MDX + --help, -h Show this help + +Environment Variables: + NOTION_TOKEN Your Notion integration token + +Examples: + # Full conversion workflow + NOTION_TOKEN=your_token node index.mjs --clean + + # Only convert Notion pages to Markdown + node index.mjs --notion-only --token=your_token + + # Only convert existing Markdown to MDX + node index.mjs --mdx-only + + # Custom paths + node index.mjs --input=my-pages.json --output=converted/ --token=your_token + +Configuration File Format (pages.json): +{ + "pages": [ + { + "id": "your-notion-page-id", + "title": "Page Title", + "slug": "page-slug" + } + ] +} + +Workflow: + 1. Notion → Markdown (with media download) + 2. Markdown → MDX (with Astro components) + 3. 
Copy to Astro content directory +`); +} + +function ensureDirectory(dir) { + if (!existsSync(dir)) { + mkdirSync(dir, { recursive: true }); + } +} + +async function cleanDirectory(dir) { + if (existsSync(dir)) { + const { execSync } = await import('child_process'); + execSync(`rm -rf "${dir}"/*`, { stdio: 'inherit' }); + } +} + +function readPagesConfig(inputFile) { + try { + const content = readFileSync(inputFile, 'utf8'); + return JSON.parse(content); + } catch (error) { + console.error(`❌ Error reading pages config: ${error.message}`); + return { pages: [] }; + } +} + +/** + * Create a temporary pages.json from NOTION_PAGE_ID environment variable + * Extracts title and generates slug from the Notion page + */ +async function createPagesConfigFromEnv(pageId, token, outputPath) { + try { + console.log('🔍 Fetching page info from Notion API...'); + const notion = new Client({ auth: token }); + const page = await notion.pages.retrieve({ page_id: pageId }); + + // Extract title + let title = 'Article'; + if (page.properties.title && page.properties.title.title && page.properties.title.title.length > 0) { + title = page.properties.title.title[0].plain_text; + } else if (page.properties.Name && page.properties.Name.title && page.properties.Name.title.length > 0) { + title = page.properties.Name.title[0].plain_text; + } + + // Generate slug from title + const slug = title + .toLowerCase() + .replace(/[^\w\s-]/g, '') + .replace(/\s+/g, '-') + .replace(/-+/g, '-') + .trim(); + + console.log(` ✅ Found page: "${title}" (slug: ${slug})`); + + // Create pages config + const pagesConfig = { + pages: [{ + id: pageId, + title: title, + slug: slug + }] + }; + + // Write to temporary file + writeFileSync(outputPath, JSON.stringify(pagesConfig, null, 4)); + console.log(` ✅ Created temporary pages config`); + + return pagesConfig; + } catch (error) { + console.error(`❌ Error fetching page from Notion: ${error.message}`); + throw error; + } +} + +/** + * Final cleanup function to 
remove exclude tags and unused imports + * @param {string} content - MDX content + * @returns {string} - Cleaned content + */ +function cleanupExcludeTagsAndImports(content) { + let cleanedContent = content; + let removedCount = 0; + const removedImageVariables = new Set(); + + // First, extract image variable names from exclude blocks before removing them + const excludeBlocks = cleanedContent.match(/<exclude>[\s\S]*?<\/exclude>/g) || []; + excludeBlocks.forEach(match => { + const imageMatches = match.match(/src=\{([^}]+)\}/g); + if (imageMatches) { + imageMatches.forEach(imgMatch => { + const varName = imgMatch.match(/src=\{([^}]+)\}/)?.[1]; + if (varName) { + removedImageVariables.add(varName); + } + }); + } + }); + + // Remove <exclude> tags and everything between them (including multiline) + cleanedContent = cleanedContent.replace(/<exclude>[\s\S]*?<\/exclude>/g, (match) => { + removedCount++; + return ''; + }); + + // Remove unused image imports that were only used in exclude blocks + if (removedImageVariables.size > 0) { + removedImageVariables.forEach(varName => { + // Check if the variable is still used elsewhere in the content after removing exclude blocks + const remainingUsage = cleanedContent.includes(`{${varName}}`) || cleanedContent.includes(`src={${varName}}`); + + if (!remainingUsage) { + // Remove import lines for unused image variables + // Pattern: import VarName from './assets/image/filename'; + const importPattern = new RegExp(`import\\s+${varName.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')}\\s+from\\s+['"][^'"]+['"];?\\s*`, 'g'); + cleanedContent = cleanedContent.replace(importPattern, ''); + console.log(` 🗑️ Removed unused import: ${varName}`); + } + }); + } + + if (removedCount > 0) { + console.log(` 🧹 Final cleanup: removed ${removedCount} exclude block(s) and ${removedImageVariables.size} unused import(s)`); + } + + // Ensure there's always a blank line after imports before content starts + // Find the last import line and ensure there's a blank line 
before the next 
non-empty line + const lines = cleanedContent.split('\n'); + let lastImportIndex = -1; + + // Find the last import line + for (let i = 0; i < lines.length; i++) { + if (lines[i].trim().startsWith('import ') && lines[i].trim().endsWith(';')) { + lastImportIndex = i; + } + } + + // If we found imports, ensure there's a blank line after the last one + if (lastImportIndex >= 0) { + // Find the next non-empty line after the last import + let nextNonEmptyIndex = lastImportIndex + 1; + while (nextNonEmptyIndex < lines.length && lines[nextNonEmptyIndex].trim() === '') { + nextNonEmptyIndex++; + } + + // If there's no blank line between the last import and next content, add one + if (nextNonEmptyIndex > lastImportIndex + 1) { + // There are already blank lines, this is fine + } else { + // No blank line, add one + lines.splice(nextNonEmptyIndex, 0, ''); + } + + cleanedContent = lines.join('\n'); + } + + return cleanedContent; +} + +function copyToAstroContent(outputDir) { + console.log('📋 Copying MDX files to Astro content directory...'); + + try { + // Ensure Astro directories exist + mkdirSync(dirname(ASTRO_CONTENT_PATH), { recursive: true }); + mkdirSync(ASTRO_ASSETS_PATH, { recursive: true }); + + // Copy MDX file + const files = readdirSync(outputDir); + const mdxFiles = files.filter(file => file.endsWith('.mdx')); + if (mdxFiles.length > 0) { + const mdxFile = join(outputDir, mdxFiles[0]); // Take the first MDX file + // Read and write instead of copy to avoid EPERM issues + let mdxContent = readFileSync(mdxFile, 'utf8'); + + // Apply final cleanup to ensure no exclude tags or unused imports remain + mdxContent = cleanupExcludeTagsAndImports(mdxContent); + + writeFileSync(ASTRO_CONTENT_PATH, mdxContent); + console.log(` ✅ Copied and cleaned MDX to ${ASTRO_CONTENT_PATH}`); + } + + // Copy images from both media and external-images directories + const imageExtensions = ['.png', '.jpg', '.jpeg', '.gif', '.svg', '.webp', '.bmp', '.tiff', '.html']; + let totalImageCount = 
0; + + function copyImagesRecursively(dir, sourceName) { + if (!existsSync(dir)) return; + + const files = readdirSync(dir); + for (const file of files) { + const filePath = join(dir, file); + const stat = statSync(filePath); + + if (stat.isDirectory()) { + copyImagesRecursively(filePath, sourceName); + } else if (imageExtensions.some(ext => file.toLowerCase().endsWith(ext))) { + const filename = basename(filePath); + const destPath = join(ASTRO_ASSETS_PATH, filename); + + try { + // Validate image by checking file size and basic structure + const stats = statSync(filePath); + if (stats.size === 0) { + console.log(` ⚠️ Skipping empty image: ${filename}`); + return; + } + + // Try to copy and validate the result + copyFileSync(filePath, destPath); + + // Additional validation - check if the copied file has reasonable size + const destStats = statSync(destPath); + if (destStats.size === 0) { + console.log(` ❌ Failed to copy corrupted image: ${filename}`); + // Remove the empty file + try { + unlinkSync(destPath); + } catch (e) { } + return; + } + + console.log(` ✅ Copied ${sourceName}: ${filename} (${destStats.size} bytes)`); + totalImageCount++; + } catch (error) { + console.log(` ❌ Failed to copy ${filename}: ${error.message}`); + } + } + } + } + + // Copy images from media directory (Notion images) + const mediaDir = join(outputDir, 'media'); + copyImagesRecursively(mediaDir, 'Notion image'); + + // Copy images from external-images directory (downloaded external images) + const externalImagesDir = join(outputDir, 'external-images'); + copyImagesRecursively(externalImagesDir, 'external image'); + + if (totalImageCount > 0) { + console.log(` ✅ Copied ${totalImageCount} total image(s) to ${ASTRO_ASSETS_PATH}`); + } + + // Always update image paths and filter problematic references in MDX file + if (existsSync(ASTRO_CONTENT_PATH)) { + const mdxContent = readFileSync(ASTRO_CONTENT_PATH, 'utf8'); + let updatedContent = mdxContent.replace(/\.\/media\//g, 
'./assets/image/'); + // Remove the subdirectory from image paths since we copy images directly to assets/image/ + updatedContent = updatedContent.replace(/\.\/assets\/image\/[^\/]+\//g, './assets/image/'); + + // Check which images actually exist and remove references to missing/corrupted ones + const imageReferences = updatedContent.match(/\.\/assets\/image\/[^\s\)]+/g) || []; + const existingImages = existsSync(ASTRO_ASSETS_PATH) ? readdirSync(ASTRO_ASSETS_PATH).filter(f => + ['.png', '.jpg', '.jpeg', '.gif', '.svg', '.webp', '.bmp', '.tiff'].some(ext => f.toLowerCase().endsWith(ext)) + ) : []; + + for (const imgRef of imageReferences) { + const filename = basename(imgRef); + if (!existingImages.includes(filename)) { + console.log(` ⚠️ Removing reference to missing/corrupted image: ${filename}`); + // Remove the entire image reference (both Image component and markdown syntax) + updatedContent = updatedContent.replace( + new RegExp(`]*src=["']${imgRef.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')}["'][^>]*\/?>`, 'g'), + '' + ); + updatedContent = updatedContent.replace( + new RegExp(`!\\[.*?\\]\\(${imgRef.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')}\\)`, 'g'), + '' + ); + } + } + + writeFileSync(ASTRO_CONTENT_PATH, updatedContent); + console.log(` ✅ Updated image paths and filtered problematic references in MDX file`); + } + + // Copy static bibliography.bib if it exists, otherwise create empty + if (existsSync(STATIC_BIB_PATH)) { + const bibContent = readFileSync(STATIC_BIB_PATH, 'utf8'); + writeFileSync(ASTRO_BIB_PATH, bibContent); + console.log(` ✅ Copied static bibliography from ${STATIC_BIB_PATH}`); + } else { + writeFileSync(ASTRO_BIB_PATH, ''); + console.log(` ✅ Created empty bibliography (no static file found)`); + } + + } catch (error) { + console.warn(` ⚠️ Failed to copy to Astro: ${error.message}`); + } +} + + +async function main() { + const args = process.argv.slice(2); + + if (args.includes('--help') || args.includes('-h')) { + showHelp(); + process.exit(0); + 
} + + const config = parseArgs(); + + console.log('🚀 Notion to MDX Toolkit'); + console.log('========================'); + + try { + // Prepare input config file + let inputConfigFile = config.input; + let pageIdFromEnv = null; + + // If NOTION_PAGE_ID is provided via env var, create temporary pages.json + if (config.pageId && config.token) { + console.log('✨ Using NOTION_PAGE_ID from environment variable'); + const tempConfigPath = join(config.output, '.temp-pages.json'); + ensureDirectory(config.output); + await createPagesConfigFromEnv(config.pageId, config.token, tempConfigPath); + inputConfigFile = tempConfigPath; + pageIdFromEnv = config.pageId; + } else if (!existsSync(config.input)) { + console.error(`❌ No NOTION_PAGE_ID environment variable and no pages.json found at: ${config.input}`); + console.log('💡 Either set NOTION_PAGE_ID env var or create input/pages.json'); + process.exit(1); + } + + // Always clean output directory to avoid conflicts with previous imports + console.log('🧹 Cleaning output directory to avoid conflicts...'); + await cleanDirectory(config.output); + + // Clean assets/image directory and ensure proper permissions + console.log('🧹 Cleaning assets/image directory and setting permissions...'); + if (existsSync(ASTRO_ASSETS_PATH)) { + await cleanDirectory(ASTRO_ASSETS_PATH); + } else { + ensureDirectory(ASTRO_ASSETS_PATH); + } + + // Ensure proper permissions for assets directory + const { execSync } = await import('child_process'); + try { + execSync(`chmod -R 755 "${ASTRO_ASSETS_PATH}"`, { stdio: 'inherit' }); + console.log(' ✅ Set permissions for assets/image directory'); + } catch (error) { + console.log(' ⚠️ Could not set permissions (non-critical):', error.message); + } + + if (config.mdxOnly) { + // Only convert existing Markdown to MDX + console.log('📝 MDX conversion only mode'); + await convertToMdx(config.output, config.output); + copyToAstroContent(config.output); + + } else if (config.notionOnly) { + // Only convert Notion to 
Markdown + console.log('📄 Notion conversion only mode'); + await convertNotionToMarkdown(inputConfigFile, config.output, config.token); + + } else { + // Full workflow + console.log('🔄 Full conversion workflow'); + + // Step 1: Convert Notion to Markdown + console.log('\n📄 Step 1: Converting Notion pages to Markdown...'); + await convertNotionToMarkdown(inputConfigFile, config.output, config.token); + + // Step 2: Convert Markdown to MDX with Notion metadata + console.log('\n📝 Step 2: Converting Markdown to MDX...'); + const pagesConfig = readPagesConfig(inputConfigFile); + const firstPage = pagesConfig.pages && pagesConfig.pages.length > 0 ? pagesConfig.pages[0] : null; + const pageId = pageIdFromEnv || (firstPage ? firstPage.id : null); + await convertToMdx(config.output, config.output, pageId, config.token); + + // Step 3: Copy to Astro content directory + console.log('\n📋 Step 3: Copying to Astro content directory...'); + copyToAstroContent(config.output); + } + + console.log('\n🎉 Conversion completed successfully!'); + + } catch (error) { + console.error('❌ Error:', error.message); + process.exit(1); + } +} + +// Export functions for use as module +export { convertNotionToMarkdown, convertToMdx }; + +// Run CLI if called directly +if (import.meta.url === `file://${process.argv[1]}`) { + main(); +} diff --git a/app/scripts/notion-importer/input/pages.json b/app/scripts/notion-importer/input/pages.json new file mode 100644 index 0000000000000000000000000000000000000000..d043e3a1081cb57d6605c813415ae02f847db229 --- /dev/null +++ b/app/scripts/notion-importer/input/pages.json @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2d51fba4ce9b05562f5df611a150e3cd702b487d2e608441318336556e0f248a +size 188 diff --git a/app/scripts/notion-importer/mdx-converter.mjs b/app/scripts/notion-importer/mdx-converter.mjs new file mode 100644 index 0000000000000000000000000000000000000000..8d6a4e206dfe4bae21217d8c9cdd3c8d91a25583 --- /dev/null +++ 
b/app/scripts/notion-importer/mdx-converter.mjs @@ -0,0 +1,863 @@ +#!/usr/bin/env node + +import { readFileSync, writeFileSync, existsSync, mkdirSync, readdirSync, statSync } from 'fs'; +import { join, dirname, basename, extname } from 'path'; +import { fileURLToPath } from 'url'; +import matter from 'gray-matter'; +import fetch from 'node-fetch'; + +const __filename = fileURLToPath(import.meta.url); +const __dirname = dirname(__filename); + +// Configuration +const DEFAULT_INPUT = join(__dirname, 'output'); +const DEFAULT_OUTPUT = join(__dirname, 'output'); +const STATIC_FRONTMATTER_PATH = join(__dirname, 'static', 'frontmatter.mdx'); + +function parseArgs() { + const args = process.argv.slice(2); + const config = { + input: DEFAULT_INPUT, + output: DEFAULT_OUTPUT, + }; + + for (const arg of args) { + if (arg.startsWith('--input=')) { + config.input = arg.substring('--input='.length); + } else if (arg.startsWith('--output=')) { + config.output = arg.substring('--output='.length); + } else if (arg === '--help' || arg === '-h') { + console.log(` +📝 Notion Markdown to MDX Converter + +Usage: + node mdx-converter.mjs [options] + +Options: + --input=PATH Input directory or file (default: ${DEFAULT_INPUT}) + --output=PATH Output directory (default: ${DEFAULT_OUTPUT}) + --help, -h Show this help + +Examples: + # Convert all markdown files in output directory + node mdx-converter.mjs + + # Convert specific file + node mdx-converter.mjs --input=article.md --output=converted/ + + # Convert directory + node mdx-converter.mjs --input=markdown-files/ --output=mdx-files/ + `); + process.exit(0); + } else if (!config.input) { + config.input = arg; + } else if (!config.output) { + config.output = arg; + } + } + return config; +} + +/** + * Track which Astro components are used during transformations + */ +const usedComponents = new Set(); + +/** + * Track individual image imports needed + */ +const imageImports = new Map(); // src -> varName + +/** + * Track external images that 
need to be downloaded + */ +const externalImagesToDownload = new Map(); // url -> localPath + +/** + * Generate a variable name from image path + * @param {string} src - Image source path + * @returns {string} - Valid variable name + */ +function generateImageVarName(src) { + // Extract filename without extension and make it a valid JS variable + const filename = src.split('/').pop().replace(/\.[^.]+$/, ''); + return filename.replace(/[^a-zA-Z0-9]/g, '_').replace(/^[0-9]/, 'img_$&'); +} + +/** + * Check if a URL is an external URL (HTTP/HTTPS) + * @param {string} url - URL to check + * @returns {boolean} - True if it's an external URL + */ +function isExternalImageUrl(url) { + try { + const urlObj = new URL(url); + // Just check if it's HTTP/HTTPS - we'll try to download everything + return urlObj.protocol === 'http:' || urlObj.protocol === 'https:'; + } catch { + return false; + } +} + +/** + * Extract image URL from Twitter/X page + * @param {string} tweetUrl - URL of the tweet + * @returns {Promise} - URL of the image or null if not found + */ +async function extractTwitterImageUrl(tweetUrl) { + try { + const response = await fetch(tweetUrl, { + headers: { + 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36' + } + }); + + if (!response.ok) { + return null; + } + + const html = await response.text(); + + // Try to find image URLs in meta tags (Twitter Card) + const metaImageMatch = html.match(/} - Local path to the downloaded file + */ +async function downloadExternalImage(imageUrl, outputDir) { + try { + console.log(` 🌐 Downloading external URL: ${imageUrl}`); + + // Create output directory if it doesn't exist + if (!existsSync(outputDir)) { + mkdirSync(outputDir, { recursive: true }); + } + + let actualImageUrl = imageUrl; + + // Check if it's a Twitter/X URL + if (imageUrl.includes('twitter.com/') || imageUrl.includes('x.com/')) { + console.log(` 🐦 Detected Twitter/X URL, 
attempting to extract image...`); + const extractedUrl = await extractTwitterImageUrl(imageUrl); + if (extractedUrl) { + actualImageUrl = extractedUrl; + console.log(` ✅ Extracted image URL: ${extractedUrl}`); + } else { + console.log(` ⚠️ Could not automatically extract image from Twitter/X`); + console.log(` 💡 Manual download required:`); + console.log(` 1. Open ${imageUrl} in your browser`); + console.log(` 2. Right-click on the image and "Save image as..."`); + console.log(` 3. Save it to: app/src/content/assets/image/`); + throw new Error('Twitter/X images require manual download'); + } + } + + // Generate filename from URL + const urlObj = new URL(actualImageUrl); + const pathname = urlObj.pathname; + + // Determine file extension - try to get it from URL, default to jpg + let extension = 'jpg'; + if (pathname.includes('.')) { + const urlExtension = pathname.split('.').pop().toLowerCase(); + if (['jpg', 'jpeg', 'png', 'gif', 'svg', 'webp', 'bmp', 'tiff'].includes(urlExtension)) { + extension = urlExtension; + } + } + + // Generate unique filename + const filename = `external_${Date.now()}_${Math.random().toString(36).substr(2, 9)}.${extension}`; + const localPath = join(outputDir, filename); + + // Try to download the URL + const response = await fetch(actualImageUrl, { + headers: { + 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36' + } + }); + + if (!response.ok) { + throw new Error(`HTTP ${response.status}: ${response.statusText}`); + } + + const buffer = await response.buffer(); + + // Validate that we actually got data + if (buffer.length === 0) { + throw new Error('Empty response'); + } + + // Validate that it's actually an image, not HTML + const contentType = response.headers.get('content-type'); + if (contentType && contentType.includes('text/html')) { + throw new Error('Downloaded content is HTML, not an image'); + } + + // Save to local file + writeFileSync(localPath, 
buffer); + + console.log(` ✅ Downloaded: ${filename} (${buffer.length} bytes)`); + return localPath; + + } catch (error) { + console.log(` ❌ Failed to download ${imageUrl}: ${error.message}`); + throw error; + } +} + +/** + * Process external images in content and download them + * @param {string} content - Markdown content + * @param {string} outputDir - Directory to save downloaded images + * @returns {Promise} - Content with external images replaced by local paths + */ +async function processExternalImages(content, outputDir) { + console.log(' 🌐 Processing external images...'); + + let processedCount = 0; + let downloadedCount = 0; + + // Find all external image URLs in markdown format: ![alt](url) + const externalImageRegex = /!\[([^\]]*)\]\(([^)]+)\)/g; + let match; + const externalImages = new Map(); // url -> alt text + + // First pass: collect all external image URLs + while ((match = externalImageRegex.exec(content)) !== null) { + const alt = match[1]; + const url = match[2]; + + if (isExternalImageUrl(url)) { + externalImages.set(url, alt); + console.log(` 🔍 Found external image: ${url}`); + } + } + + if (externalImages.size === 0) { + console.log(' ℹ️ No external images found'); + return content; + } + + // Second pass: download images and replace URLs + let processedContent = content; + + for (const [url, alt] of externalImages) { + try { + // Download the image + const localPath = await downloadExternalImage(url, outputDir); + const relativePath = `./assets/image/${basename(localPath)}`; + + // Replace the URL in content + processedContent = processedContent.replace( + new RegExp(`!\\[${alt.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')}\\]\\(${url.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')}\\)`, 'g'), + `![${alt}](${relativePath})` + ); + + downloadedCount++; + processedCount++; + + } catch (error) { + console.log(` ⚠️ Skipping external image due to download failure: ${url}`); + } + } + + if (downloadedCount > 0) { + console.log(` ✅ Downloaded ${downloadedCount} 
external image(s)`); + } + + return processedContent; +} + +/** + * Detect and track Astro components used in the content + * @param {string} content - MDX content + */ +function detectAstroComponents(content) { + console.log(' 🔍 Detecting Astro components in content...'); + + let detectedCount = 0; + + // Known Astro components that should be auto-imported + const knownComponents = [ + 'HtmlEmbed', 'Image', 'Note', 'Sidenote', 'Wide', 'FullWidth', + 'Accordion', 'Quote', 'Reference', 'Glossary', 'Stack', 'ThemeToggle', + 'RawHtml', 'HfUser' + ]; + + // Find all JSX elements that look like Astro components + // Pattern: + const componentMatches = content.match(/<([A-Z][a-zA-Z0-9]*)\s*[^>]*\/?>/g); + + if (componentMatches) { + for (const match of componentMatches) { + // Extract component name from the JSX element + const componentMatch = match.match(/<([A-Z][a-zA-Z0-9]*)/); + if (componentMatch) { + const componentName = componentMatch[1]; + + // Only track known Astro components (skip HTML elements) + if (knownComponents.includes(componentName) && !usedComponents.has(componentName)) { + usedComponents.add(componentName); + detectedCount++; + console.log(` 📦 Found component: ${componentName}`); + } + } + } + } + + if (detectedCount > 0) { + console.log(` ✅ Detected ${detectedCount} new Astro component(s)`); + } else { + console.log(` ℹ️ No new Astro components detected`); + } +} + +/** + * Add required component imports to the frontmatter + * @param {string} content - MDX content + * @returns {string} - Content with component imports + */ +function addComponentImports(content) { + console.log(' 📦 Adding component and image imports...'); + + let imports = []; + + // Add component imports + if (usedComponents.size > 0) { + const componentImports = Array.from(usedComponents) + .map(component => `import ${component} from '../components/${component}.astro';`); + imports.push(...componentImports); + console.log(` ✅ Importing components: 
${Array.from(usedComponents).join(', ')}`); + } + + // Add image imports + if (imageImports.size > 0) { + const imageImportStatements = Array.from(imageImports.entries()) + .map(([src, varName]) => `import ${varName} from '${src}';`); + imports.push(...imageImportStatements); + console.log(` ✅ Importing ${imageImports.size} image(s)`); + } + + if (imports.length === 0) { + console.log(' ℹ️ No imports needed'); + return content; + } + + const importBlock = imports.join('\n'); + + // Insert imports after frontmatter + const frontmatterEnd = content.indexOf('---', 3) + 3; + if (frontmatterEnd > 2) { + return content.slice(0, frontmatterEnd) + '\n\n' + importBlock + '\n\n' + content.slice(frontmatterEnd); + } else { + // No frontmatter, add at beginning + return importBlock + '\n\n' + content; + } +} + + +/** + * Load static frontmatter from file + * @returns {object} - Static frontmatter data + */ +function loadStaticFrontmatter() { + try { + if (existsSync(STATIC_FRONTMATTER_PATH)) { + const staticContent = readFileSync(STATIC_FRONTMATTER_PATH, 'utf8'); + const { data } = matter(staticContent); + console.log(' ✅ Loaded static frontmatter from file'); + return data; + } + console.log(' ℹ️ No static frontmatter file found'); + return {}; + } catch (error) { + console.log(` ⚠️ Failed to load static frontmatter: ${error.message}`); + return {}; + } +} + +/** + * Ensure proper frontmatter for MDX using static file first, then existing data + * @param {string} content - MDX content + * @param {string} pageId - Notion page ID (optional, kept for compatibility but ignored) + * @param {string} notionToken - Notion API token (optional, kept for compatibility but ignored) + * @returns {string} - Content with proper frontmatter + */ +async function ensureFrontmatter(content, pageId = null, notionToken = null) { + console.log(' 📄 Ensuring proper frontmatter...'); + + // Load static frontmatter first (highest priority) + const staticData = loadStaticFrontmatter(); + + if 
(!content.startsWith('---')) { + // No frontmatter in content, use static + basic defaults + let baseData = { ...staticData }; + + // Add basic defaults for required fields if not in static + if (!baseData.title) baseData.title = 'Article'; + if (!baseData.published) { + baseData.published = new Date().toLocaleDateString('en-US', { + year: 'numeric', + month: 'short', + day: '2-digit' + }); + } + if (baseData.tableOfContentsAutoCollapse === undefined) { + baseData.tableOfContentsAutoCollapse = true; + } + + const frontmatter = matter.stringify('', baseData); + console.log(' ✅ Applied static frontmatter to content without frontmatter'); + return frontmatter + content; + } + + // Parse existing frontmatter and merge with static (static takes priority) + try { + const { data: existingData, content: body } = matter(content); + + // Merge: existing data first, then static data overrides + const mergedData = { ...existingData, ...staticData }; + + // Ensure required fields if still missing after merge + if (!mergedData.title) mergedData.title = 'Article'; + if (!mergedData.published) { + mergedData.published = new Date().toLocaleDateString('en-US', { + year: 'numeric', + month: 'short', + day: '2-digit' + }); + } + if (mergedData.tableOfContentsAutoCollapse === undefined) { + mergedData.tableOfContentsAutoCollapse = true; + } + + const enhancedContent = matter.stringify(body, mergedData); + console.log(' ✅ Merged static and existing frontmatter'); + return enhancedContent; + } catch (error) { + console.log(' ⚠️ Could not parse frontmatter, keeping as is'); + return content; + } +} + +/** + * Generate basic frontmatter + * @returns {string} - Basic frontmatter + */ +function generateBasicFrontmatter() { + const currentDate = new Date().toLocaleDateString('en-US', { + year: 'numeric', + month: 'short', + day: '2-digit' + }); + return `--- +title: "Notion Article" +published: "${currentDate}" +tableOfContentsAutoCollapse: true +--- + +`; +} + + +/** + * Check if a line is a 
table line + * @param {string} line - Line to check + * @returns {boolean} - True if it's a table line + */ +function isTableLine(line) { + const trimmed = line.trim(); + return trimmed.startsWith('|') && trimmed.endsWith('|'); +} + +/** + * Check if a line is a list item + * @param {string} line - Line to check + * @returns {boolean} - True if it's a list item + */ +function isListItem(line) { + const trimmed = line.trim(); + // Match: * -, + (bullet points) or 1. 2. 3. (numbered lists) + return /^\s*[\*\-\+]\s/.test(trimmed) || /^\s*\d+\.\s/.test(trimmed); +} + +/** + * Add a blank line after each markdown table and list + * @param {string} content - MDX content + * @returns {string} - Content with blank lines after tables and lists + */ +function addBlankLineAfterTablesAndLists(content) { + console.log(' 📋 Adding blank lines after tables and lists...'); + + let addedTableCount = 0; + let addedListCount = 0; + const lines = content.split('\n'); + const result = []; + + for (let i = 0; i < lines.length; i++) { + result.push(lines[i]); + + // Check if current line is the end of a table + if (isTableLine(lines[i])) { + // Look ahead to see if this is the last line of a table + let isLastTableLine = false; + + // Check if next line is empty or doesn't start with | + if (i + 1 >= lines.length || + lines[i + 1].trim() === '' || + !isTableLine(lines[i + 1])) { + + // Look back to find if we're actually inside a table + let tableLineCount = 0; + for (let j = i; j >= 0 && isTableLine(lines[j]); j--) { + tableLineCount++; + } + + // Only add blank line if we found at least 2 table lines (making it a real table) + if (tableLineCount >= 2) { + isLastTableLine = true; + } + } + + if (isLastTableLine) { + addedTableCount++; + result.push(''); // Add blank line + } + } + // Check if current line is the end of a list + else if (isListItem(lines[i])) { + // Look ahead to see if this is the last line of a list + let isLastListItem = false; + + // Check if next line is empty or 
doesn't start with list marker + if (i + 1 >= lines.length || + lines[i + 1].trim() === '' || + !isListItem(lines[i + 1])) { + isLastListItem = true; + } + + if (isLastListItem) { + addedListCount++; + result.push(''); // Add blank line + } + } + } + + if (addedTableCount > 0 || addedListCount > 0) { + console.log(` ✅ Added blank line after ${addedTableCount} table(s) and ${addedListCount} list(s)`); + } else { + console.log(' ℹ️ No tables or lists found to process'); + } + + return result.join('\n'); +} + +/** + * Transform markdown images to Image components + * @param {string} content - Markdown content + * @returns {string} - Content with Image components + */ +function transformMarkdownImages(content) { + console.log(' 🖼️ Transforming markdown images to Image components...'); + + let transformedCount = 0; + + // Transform markdown images: ![alt](src) -> + content = content.replace(/!\[([^\]]*)\]\(([^)]+)\)/g, (match, alt, src) => { + transformedCount++; + + // Clean up the src path - remove /media/ prefix and use relative path + let cleanSrc = src; + if (src.startsWith('/media/')) { + cleanSrc = src.replace('/media/', './assets/image/'); + } + + // Generate variable name for the image import + const varName = generateImageVarName(cleanSrc); + + // Add to imageImports if not already present + if (!imageImports.has(cleanSrc)) { + imageImports.set(cleanSrc, varName); + } + + // Extract filename for alt text if none provided + const finalAlt = alt || src.split('/').pop().split('.')[0]; + + return ``; + }); + + if (transformedCount > 0) { + console.log(` ✅ Transformed ${transformedCount} markdown image(s) to Image components with imports`); + } else { + console.log(' ℹ️ No markdown images found to transform'); + } + + return content; +} + +/** + * Add proper spacing around Astro components + * @param {string} content - MDX content + * @returns {string} - Content with proper spacing around components + */ +function addSpacingAroundComponents(content) { + 
console.log(' 📏 Adding spacing around Astro components...'); + + let processedContent = content; + let spacingCount = 0; + + // Known Astro components that should have spacing + const knownComponents = [ + 'HtmlEmbed', 'Image', 'Note', 'Sidenote', 'Wide', 'FullWidth', + 'Accordion', 'Quote', 'Reference', 'Glossary', 'Stack', 'ThemeToggle', + 'RawHtml', 'HfUser', 'Figure' + ]; + + // Process each component type + for (const component of knownComponents) { + // Pattern for components with content: ... + // Process this first to handle the complete component structure + const withContentPattern = new RegExp(`(<${component}[^>]*>)([\\s\\S]*?)(<\\/${component}>)`, 'g'); + processedContent = processedContent.replace(withContentPattern, (match, openTag, content, closeTag) => { + spacingCount++; + // Ensure blank line before opening tag and after closing tag + // Also ensure closing tag is on its own line + const trimmedContent = content.trim(); + return `\n\n${openTag}\n${trimmedContent}\n${closeTag}\n\n`; + }); + + // Pattern for self-closing components: + const selfClosingPattern = new RegExp(`(<${component}[^>]*\\/?>)`, 'g'); + processedContent = processedContent.replace(selfClosingPattern, (match) => { + spacingCount++; + return `\n\n${match}\n\n`; + }); + } + + // Clean up excessive newlines (more than 2 consecutive) + processedContent = processedContent.replace(/\n{3,}/g, '\n\n'); + + if (spacingCount > 0) { + console.log(` ✅ Added spacing around ${spacingCount} component(s)`); + } else { + console.log(' ℹ️ No components found to add spacing around'); + } + + return processedContent; +} + +/** + * Fix smart quotes (curly quotes) and replace them with straight quotes + * @param {string} content - Markdown content + * @returns {string} - Content with fixed quotes + */ +function fixSmartQuotes(content) { + console.log(' ✏️ Fixing smart quotes (curly quotes)...'); + + let fixedCount = 0; + const originalContent = content; + + // Replace opening smart double quotes 
(\u201C) with straight quotes (") + content = content.replace(/\u201C/g, '"'); + + // Replace closing smart double quotes (\u201D) with straight quotes (") + content = content.replace(/\u201D/g, '"'); + + // Replace opening smart single quotes (\u2018) with straight quotes (') + content = content.replace(/\u2018/g, "'"); + + // Replace closing smart single quotes (\u2019) with straight quotes (') + content = content.replace(/\u2019/g, "'"); + + // Count the number of replacements made + fixedCount = 0; + for (let i = 0; i < originalContent.length; i++) { + const char = originalContent[i]; + if (char === '\u201C' || char === '\u201D' || char === '\u2018' || char === '\u2019') { + fixedCount++; + } + } + + if (fixedCount > 0) { + console.log(` ✅ Fixed ${fixedCount} smart quote(s)`); + } else { + console.log(' ℹ️ No smart quotes found'); + } + + return content; +} + +/** + * Main MDX processing function that applies all transformations + * @param {string} content - Raw Markdown content + * @param {string} pageId - Notion page ID (optional) + * @param {string} notionToken - Notion API token (optional) + * @param {string} outputDir - Output directory for downloaded images (optional) + * @returns {string} - Processed MDX content compatible with Astro + */ +async function processMdxContent(content, pageId = null, notionToken = null, outputDir = null) { + console.log('🔧 Processing for Astro MDX compatibility...'); + + // Clear previous tracking + usedComponents.clear(); + imageImports.clear(); + externalImagesToDownload.clear(); + + let processedContent = content; + + // Fix smart quotes first + processedContent = fixSmartQuotes(processedContent); + + // Process external images first (before other transformations) + if (outputDir) { + // Create a temporary external images directory in the output folder + const externalImagesDir = join(outputDir, 'external-images'); + processedContent = await processExternalImages(processedContent, externalImagesDir); + } + + // Apply 
essential steps only + processedContent = await ensureFrontmatter(processedContent, pageId, notionToken); + + // Add blank lines after tables and lists + processedContent = addBlankLineAfterTablesAndLists(processedContent); + + // Transform markdown images to Image components + processedContent = transformMarkdownImages(processedContent); + + // Add spacing around Astro components + processedContent = addSpacingAroundComponents(processedContent); + + // Detect Astro components used in the content before adding imports + detectAstroComponents(processedContent); + + // Add component imports at the end + processedContent = addComponentImports(processedContent); + + return processedContent; +} + +/** + * Convert a single markdown file to MDX + * @param {string} inputFile - Input markdown file + * @param {string} outputDir - Output directory + * @param {string} pageId - Notion page ID (optional) + * @param {string} notionToken - Notion API token (optional) + */ +async function convertFileToMdx(inputFile, outputDir, pageId = null, notionToken = null) { + const filename = basename(inputFile, '.md'); + const outputFile = join(outputDir, `${filename}.mdx`); + + console.log(`📝 Converting: ${basename(inputFile)} → ${basename(outputFile)}`); + + try { + const markdownContent = readFileSync(inputFile, 'utf8'); + const mdxContent = await processMdxContent(markdownContent, pageId, notionToken, outputDir); + writeFileSync(outputFile, mdxContent); + + console.log(` ✅ Converted: ${outputFile}`); + + // Show file size + const inputSize = Math.round(markdownContent.length / 1024); + const outputSize = Math.round(mdxContent.length / 1024); + console.log(` 📊 Input: ${inputSize}KB → Output: ${outputSize}KB`); + + } catch (error) { + console.error(` ❌ Failed to convert ${inputFile}: ${error.message}`); + } +} + +/** + * Convert all markdown files in a directory to MDX + * @param {string} inputPath - Input path (file or directory) + * @param {string} outputDir - Output directory + * @param 
{string} pageId - Notion page ID (optional) + * @param {string} notionToken - Notion API token (optional) + */ +async function convertToMdx(inputPath, outputDir, pageId = null, notionToken = null) { + console.log('📝 Notion Markdown to Astro MDX Converter'); + console.log(`📁 Input: ${inputPath}`); + console.log(`📁 Output: ${outputDir}`); + + // Check if input exists + if (!existsSync(inputPath)) { + console.error(`❌ Input not found: ${inputPath}`); + process.exit(1); + } + + try { + // Ensure output directory exists + if (!existsSync(outputDir)) { + mkdirSync(outputDir, { recursive: true }); + } + + let filesToConvert = []; + + if (statSync(inputPath).isDirectory()) { + // Convert all .md files in directory + const files = readdirSync(inputPath); + filesToConvert = files + .filter(file => file.endsWith('.md') && !file.includes('.raw.md')) + .map(file => join(inputPath, file)); + } else if (inputPath.endsWith('.md')) { + // Convert single file + filesToConvert = [inputPath]; + } else { + console.error('❌ Input must be a .md file or directory containing .md files'); + process.exit(1); + } + + if (filesToConvert.length === 0) { + console.log('ℹ️ No .md files found to convert'); + return; + } + + console.log(`🔄 Found ${filesToConvert.length} file(s) to convert`); + + // Convert each file + for (const file of filesToConvert) { + await convertFileToMdx(file, outputDir, pageId, notionToken); + } + + console.log(`✅ Conversion completed! 
${filesToConvert.length} file(s) processed`); + + } catch (error) { + console.error('❌ Conversion failed:', error.message); + process.exit(1); + } +} + +export { convertToMdx }; + +async function main() { + const config = parseArgs(); + // Wait for the async conversion to finish before reporting success + await convertToMdx(config.input, config.output); + console.log('🎉 MDX conversion completed!'); +} + +if (import.meta.url === `file://${process.argv[1]}`) { + main(); +} diff --git a/app/scripts/notion-importer/notion-converter.mjs b/app/scripts/notion-importer/notion-converter.mjs new file mode 100644 index 0000000000000000000000000000000000000000..a8324152b9cfe825c8a14f811af3c958643b5e36 --- /dev/null +++ b/app/scripts/notion-importer/notion-converter.mjs @@ -0,0 +1,266 @@ +#!/usr/bin/env node + +import { config } from 'dotenv'; +import { Client } from '@notionhq/client'; +import { NotionConverter } from 'notion-to-md'; +import { DefaultExporter } from 'notion-to-md/plugins/exporter'; +import { readFileSync, writeFileSync, existsSync, mkdirSync } from 'fs'; +import { join, dirname, basename } from 'path'; +import { fileURLToPath } from 'url'; +import { postProcessMarkdown } from './post-processor.mjs'; + +// Load environment variables from .env file (but don't override existing ones) +config({ override: false }); + +const __filename = fileURLToPath(import.meta.url); +const __dirname = dirname(__filename); + +// Configuration +const DEFAULT_INPUT = join(__dirname, 'input', 'pages.json'); +const DEFAULT_OUTPUT = join(__dirname, 'output'); + +function parseArgs() { + const args = process.argv.slice(2); + const config = { + input: DEFAULT_INPUT, + output: DEFAULT_OUTPUT, + clean: false, + token: process.env.NOTION_TOKEN + }; + + // Use substring rather than split('=') so values that contain '=' are kept whole + for (const arg of args) { + if (arg.startsWith('--input=')) { + config.input = arg.substring('--input='.length); + } else if (arg.startsWith('--output=')) { + config.output = arg.substring('--output='.length); + } else if (arg.startsWith('--token=')) { + config.token = arg.substring('--token='.length); + } else if (arg === '--clean') { + config.clean = true; + } + } + 
+ return config; +} + +function ensureDirectory(dir) { + if (!existsSync(dir)) { + mkdirSync(dir, { recursive: true }); + } +} + +function loadPagesConfig(configFile) { + if (!existsSync(configFile)) { + console.error(`❌ Configuration file not found: ${configFile}`); + console.log('📝 Create a pages.json file with your Notion page IDs:'); + console.log(` +{ + "pages": [ + { + "id": "your-notion-page-id-1", + "title": "Page Title 1", + "slug": "page-1" + }, + { + "id": "your-notion-page-id-2", + "title": "Page Title 2", + "slug": "page-2" + } + ] +} + `); + process.exit(1); + } + + try { + const config = JSON.parse(readFileSync(configFile, 'utf8')); + return config.pages || []; + } catch (error) { + console.error(`❌ Error reading configuration: ${error.message}`); + process.exit(1); + } +} + +/** + * Convert a single Notion page to Markdown with advanced media handling + * @param {Object} notion - Notion client + * @param {string} pageId - Notion page ID + * @param {string} outputDir - Output directory + * @param {string} pageTitle - Page title for file naming + * @returns {Promise} - Path to generated markdown file + */ +async function convertNotionPage(notion, pageId, outputDir, pageTitle) { + console.log(`📄 Converting Notion page: ${pageTitle} (${pageId})`); + + try { + // Create media directory for this page + const mediaDir = join(outputDir, 'media', pageId); + ensureDirectory(mediaDir); + + // Configure the DefaultExporter to save to a file + const outputFile = join(outputDir, `${pageTitle}.md`); + const exporter = new DefaultExporter({ + outputType: 'file', + outputPath: outputFile, + }); + + // Create the converter with media downloading strategy + const n2m = new NotionConverter(notion) + .withExporter(exporter) + // Download media to local directory with path transformation + .downloadMediaTo({ + outputDir: mediaDir, + // Transform paths to be web-accessible + transformPath: (localPath) => `/media/${pageId}/${basename(localPath)}`, + }); + + // Convert the 
page + const result = await n2m.convert(pageId); + + console.log(` ✅ Converted to: ${outputFile}`); + console.log(` 📊 Content length: ${result.content.length} characters`); + console.log(` 🖼️ Media saved to: ${mediaDir}`); + + return outputFile; + + } catch (error) { + console.error(` ❌ Failed to convert page ${pageId}: ${error.message}`); + throw error; + } +} + +/** + * Process Notion pages with advanced configuration + * @param {string} inputFile - Path to pages configuration + * @param {string} outputDir - Output directory + * @param {string} notionToken - Notion API token + */ +export async function convertNotionToMarkdown(inputFile, outputDir, notionToken) { + console.log('🚀 Notion to Markdown Converter'); + console.log(`📁 Input: ${inputFile}`); + console.log(`📁 Output: ${outputDir}`); + + // Validate Notion token + if (!notionToken) { + console.error('❌ NOTION_TOKEN not found. Please set it as environment variable or use --token=YOUR_TOKEN'); + process.exit(1); + } + + // Ensure output directory exists + ensureDirectory(outputDir); + + try { + // Initialize Notion client + const notion = new Client({ + auth: notionToken, + }); + + // Load pages configuration + const pages = loadPagesConfig(inputFile); + console.log(`📋 Found ${pages.length} page(s) to convert`); + + const convertedFiles = []; + + // Convert each page + for (const page of pages) { + try { + const outputFile = await convertNotionPage( + notion, + page.id, + outputDir, + page.slug || page.title?.toLowerCase().replace(/\s+/g, '-') || page.id + ); + convertedFiles.push(outputFile); + } catch (error) { + console.error(`❌ Failed to convert page ${page.id}: ${error.message}`); + // Continue with other pages + } + } + + // Post-process all converted files and create one intermediate file + console.log('🔧 Post-processing converted files...'); + for (const file of convertedFiles) { + try { + // Read the raw markdown from notion-to-md + let rawContent = readFileSync(file, 'utf8'); + + // Create 
intermediate file: raw markdown (from notion-to-md) + const rawFile = file.replace(/\.md$/, '.raw.md'); + writeFileSync(rawFile, rawContent); + console.log(` 📄 Created raw markdown: ${basename(rawFile)}`); + + // Apply post-processing with Notion client for page inclusion + let processedContent = await postProcessMarkdown(rawContent, notion, notionToken); + writeFileSync(file, processedContent); + console.log(` ✅ Post-processed: ${basename(file)}`); + } catch (error) { + console.error(` ❌ Failed to post-process ${file}: ${error.message}`); + } + } + + console.log(`✅ Conversion completed! ${convertedFiles.length} file(s) generated`); + + } catch (error) { + console.error('❌ Conversion failed:', error.message); + process.exit(1); + } +} + +async function main() { + const config = parseArgs(); + + if (config.clean) { + console.log('🧹 Cleaning output directory...'); + // Clean output directory logic would go here + } + + await convertNotionToMarkdown(config.input, config.output, config.token); + console.log('🎉 Notion conversion completed!'); +} + +// Show help if requested +if (process.argv.includes('--help') || process.argv.includes('-h')) { + console.log(` +🚀 Notion to Markdown Converter + +Usage: + node notion-converter.mjs [options] + +Options: + --input=PATH Input pages configuration file (default: input/pages.json) + --output=PATH Output directory (default: output/) + --token=TOKEN Notion API token (or set NOTION_TOKEN env var) + --clean Clean output directory before conversion + --help, -h Show this help + +Environment Variables: + NOTION_TOKEN Your Notion integration token + +Examples: + # Basic conversion with environment token + NOTION_TOKEN=your_token node notion-converter.mjs + + # Custom paths and token + node notion-converter.mjs --input=my-pages.json --output=converted/ --token=your_token + + # Clean output first + node notion-converter.mjs --clean + +Configuration File Format (pages.json): +{ + "pages": [ + { + "id": "your-notion-page-id", + "title": "Page Title", +
"slug": "page-slug" + } + ] +} +`); + process.exit(0); +} + +// Run CLI if called directly +if (import.meta.url === `file://${process.argv[1]}`) { + main(); +} diff --git a/app/scripts/notion-importer/package-lock.json b/app/scripts/notion-importer/package-lock.json new file mode 100644 index 0000000000000000000000000000000000000000..690fd5728fff128e19c8881d41e2160ad6ab6efb Binary files /dev/null and b/app/scripts/notion-importer/package-lock.json differ diff --git a/app/scripts/notion-importer/package.json b/app/scripts/notion-importer/package.json new file mode 100644 index 0000000000000000000000000000000000000000..967cf990839e7eee04d275e5a79963e2582678aa Binary files /dev/null and b/app/scripts/notion-importer/package.json differ diff --git a/app/scripts/notion-importer/post-processor.mjs b/app/scripts/notion-importer/post-processor.mjs new file mode 100644 index 0000000000000000000000000000000000000000..e2810a1ef85cf065382c4be97f27de03952ab074 --- /dev/null +++ b/app/scripts/notion-importer/post-processor.mjs @@ -0,0 +1,837 @@ +#!/usr/bin/env node + +import { readFileSync, writeFileSync, existsSync, mkdirSync, unlinkSync } from 'fs'; +import { join, dirname, basename } from 'path'; +import { fileURLToPath } from 'url'; +import { Client } from '@notionhq/client'; +import { NotionConverter } from 'notion-to-md'; +import { DefaultExporter } from 'notion-to-md/plugins/exporter'; + +const __filename = fileURLToPath(import.meta.url); +const __dirname = dirname(__filename); + +/** + * Ensure directory exists + */ +function ensureDirectory(dir) { + if (!existsSync(dir)) { + mkdirSync(dir, { recursive: true }); + } +} + +/** + * Post-process Notion-generated Markdown for better MDX compatibility + * @param {string} content - Raw markdown content from Notion + * @param {Client} notionClient - Notion API client (optional) + * @param {string} notionToken - Notion API token (optional) + * @returns {Promise} - Processed markdown content + */ +export async function 
postProcessMarkdown(content, notionClient = null, notionToken = null) { + console.log('🔧 Post-processing Notion Markdown for MDX compatibility...'); + + let processedContent = content; + + // Apply each transformation step + processedContent = removeExcludeTags(processedContent); + processedContent = await includeNotionPages(processedContent, notionClient, notionToken); + processedContent = cleanNotionArtifacts(processedContent); + processedContent = fixImageAltTextWithLinks(processedContent); + processedContent = fixNotionLinks(processedContent); + processedContent = fixJsxAttributes(processedContent); + processedContent = optimizeImages(processedContent); + processedContent = shiftHeadingLevels(processedContent); + processedContent = cleanEmptyLines(processedContent); + processedContent = fixCodeBlocks(processedContent); + processedContent = fixCodeBlockEndings(processedContent); + processedContent = unwrapHtmlCodeBlocks(processedContent); + processedContent = fixPlainTextCodeBlocks(processedContent); + processedContent = optimizeTables(processedContent); + + return processedContent; +} + +/** + * Remove tags and their content, plus associated media files + * @param {string} content - Markdown content + * @returns {string} - Content with exclude tags removed and unused imports cleaned + */ +function removeExcludeTags(content) { + console.log(' 🗑️ Removing tags and associated media...'); + + let removedCount = 0; + const removedImageVariables = new Set(); + const mediaFilesToDelete = new Set(); + + // First, extract image variable names and media files from exclude blocks before removing them + const excludeBlocks = content.match(/[\s\S]*?<\/exclude>/g) || []; + excludeBlocks.forEach(match => { + // Extract image variables from JSX components + const imageMatches = match.match(/src=\{([^}]+)\}/g); + if (imageMatches) { + imageMatches.forEach(imgMatch => { + const varName = imgMatch.match(/src=\{([^}]+)\}/)?.[1]; + if (varName) { + 
removedImageVariables.add(varName); + } + }); + } + + // Extract media file paths from markdown images + const markdownImages = match.match(/!\[[^\]]*\]\(([^)]+)\)/g); + if (markdownImages) { + markdownImages.forEach(imgMatch => { + const src = imgMatch.match(/!\[[^\]]*\]\(([^)]+)\)/)?.[1]; + if (src) { + // Extract filename from path like /media/pageId/filename.png + const filename = basename(src); + if (filename) { + mediaFilesToDelete.add(filename); + } + } + }); + } + }); + + // Remove tags and everything between them (including multiline) + content = content.replace(/[\s\S]*?<\/exclude>/g, (match) => { + removedCount++; + return ''; + }); + + // Delete associated media files + if (mediaFilesToDelete.size > 0) { + console.log(` 🗑️ Found ${mediaFilesToDelete.size} media file(s) to delete from exclude blocks`); + + // Try to find and delete media files in common locations + const possibleMediaDirs = [ + join(__dirname, 'output', 'media'), + join(__dirname, '..', '..', 'src', 'content', 'assets', 'image') + ]; + + mediaFilesToDelete.forEach(filename => { + let deleted = false; + for (const mediaDir of possibleMediaDirs) { + if (existsSync(mediaDir)) { + const filePath = join(mediaDir, filename); + if (existsSync(filePath)) { + try { + unlinkSync(filePath); + console.log(` 🗑️ Deleted media file: ${filename}`); + deleted = true; + break; + } catch (error) { + console.log(` ⚠️ Failed to delete ${filename}: ${error.message}`); + } + } + } + } + if (!deleted) { + console.log(` ℹ️ Media file not found: ${filename}`); + } + }); + } + + // Remove unused image imports that were only used in exclude blocks + if (removedImageVariables.size > 0) { + console.log(` 🖼️ Found ${removedImageVariables.size} unused image import(s) in exclude blocks`); + + removedImageVariables.forEach(varName => { + // Check if the variable is still used elsewhere in the content after removing exclude blocks + const remainingUsage = content.includes(`{${varName}}`) || 
content.includes(`src={${varName}}`); + + if (!remainingUsage) { + // Remove import lines for unused image variables + // Pattern: import VarName from './assets/image/filename'; + const importPattern = new RegExp(`import\\s+${varName.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')}\\s+from\\s+['"][^'"]+['"];?\\s*`, 'g'); + content = content.replace(importPattern, ''); + console.log(` 🗑️ Removed unused import: ${varName}`); + } + }); + + console.log(` 🧹 Cleaned up unused image imports`); + } + + if (removedCount > 0) { + console.log(` ✅ Removed ${removedCount} tag(s) and their content`); + } else { + console.log(' ℹ️ No tags found'); + } + + return content; +} + +/** + * Replace Notion page links with their actual content + * @param {string} content - Markdown content + * @param {Client} notionClient - Notion API client + * @param {string} notionToken - Notion API token + * @returns {Promise} - Content with page links replaced + */ +async function includeNotionPages(content, notionClient, notionToken) { + console.log(' 📄 Including linked Notion pages...'); + + if (!notionClient || !notionToken) { + console.log(' ℹ️ Skipping page inclusion (no Notion client/token provided)'); + return content; + } + + let includedCount = 0; + let skippedCount = 0; + + // First, identify all exclude blocks to avoid processing links within them + const excludeBlocks = []; + const excludeRegex = /[\s\S]*?<\/exclude>/g; + let excludeMatch; + + while ((excludeMatch = excludeRegex.exec(content)) !== null) { + excludeBlocks.push({ + start: excludeMatch.index, + end: excludeMatch.index + excludeMatch[0].length + }); + } + + // Helper function to check if a position is within an exclude block + const isWithinExcludeBlock = (position) => { + return excludeBlocks.some(block => position >= block.start && position <= block.end); + }; + + // Regex to match links to Notion pages with UUID format + // Pattern: [text](uuid-with-dashes) + const notionPageLinkRegex = 
/\[([^\]]+)\]\(([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})\)/g; + + let processedContent = content; + let match; + + // Find all matches + const matches = []; + while ((match = notionPageLinkRegex.exec(content)) !== null) { + const linkStartPos = match.index; + + // Skip if this link is within an exclude block + if (isWithinExcludeBlock(linkStartPos)) { + console.log(` ⏭️ Skipping page link in exclude block: ${match[1]} (${match[2]})`); + skippedCount++; + continue; + } + + matches.push({ + fullMatch: match[0], + linkText: match[1], + pageId: match[2], + startPos: match.index, + endPos: match.index + match[0].length + }); + } + + // Process matches in reverse order to maintain correct indices + for (let i = matches.length - 1; i >= 0; i--) { + const link = matches[i]; + + try { + console.log(` 🔗 Fetching content for page: ${link.pageId}`); + + // Create media directory for this sub-page + const outputDir = join(__dirname, 'output'); + const mediaDir = join(outputDir, 'media', link.pageId); + ensureDirectory(mediaDir); + + // Configure the DefaultExporter to get content as string + const exporter = new DefaultExporter({ + outputType: 'string', + }); + + // Create the converter with media downloading strategy (same as convertNotionPage) + const converter = new NotionConverter(notionClient) + .withExporter(exporter) + // Download media to local directory with path transformation + .downloadMediaTo({ + outputDir: mediaDir, + // Transform paths to be web-accessible + transformPath: (localPath) => `/media/${link.pageId}/${basename(localPath)}`, + }); + + // Convert the page + const result = await converter.convert(link.pageId); + + console.log(` 🖼️ Media saved to: ${mediaDir}`); + + if (result && result.content) { + // Save raw content as .raw.md file + const rawFileName = `${link.linkText.toLowerCase().replace(/[^a-z0-9]+/g, '-')}-${link.pageId}`; + const rawFilePath = join(outputDir, `${rawFileName}.raw.md`); + + try { + writeFileSync(rawFilePath, 
result.content); + console.log(` 📄 Saved raw markdown: ${rawFileName}.raw.md`); + } catch (error) { + console.log(` ⚠️ Failed to save raw file: ${error.message}`); + } + + // Clean the content (remove frontmatter, etc.) + let pageContent = result.content; + + // Remove YAML frontmatter if present + pageContent = pageContent.replace(/^---[\s\S]*?---\s*\n/, ''); + + // Remove the first markdown heading (H1, H2, H3, etc.) from the included page + pageContent = pageContent.replace(/^#+ .+\n\n?/, ''); + + // Keep the page content without title + const finalContent = '\n\n' + pageContent.trim() + '\n\n'; + + // Replace the link with the content + processedContent = processedContent.substring(0, link.startPos) + + finalContent + + processedContent.substring(link.endPos); + + includedCount++; + console.log(` ✅ Included page content: ${link.linkText}`); + } else { + console.log(` ⚠️ No content found for page: ${link.pageId}`); + } + } catch (error) { + console.log(` ❌ Failed to fetch page ${link.pageId}: ${error.message}`); + // Keep the original link if we can't fetch the content + } + } + + if (includedCount > 0) { + console.log(` ✅ Included ${includedCount} Notion page(s)`); + } else { + console.log(' ℹ️ No Notion page links found to include'); + } + + if (skippedCount > 0) { + console.log(` ⏭️ Skipped ${skippedCount} page link(s) in exclude blocks`); + } + + return processedContent; +} + +/** + * Clean Notion-specific artifacts and formatting + * @param {string} content - Markdown content + * @returns {string} - Cleaned content + */ +function cleanNotionArtifacts(content) { + console.log(' 🧹 Cleaning Notion artifacts...'); + + let cleanedCount = 0; + + // Remove Notion's internal page references that don't convert well + content = content.replace(/\[([^\]]+)\]\(https:\/\/www\.notion\.so\/[^)]+\)/g, (match, text) => { + cleanedCount++; + return text; // Keep just the text, remove the broken link + }); + + // Clean up Notion's callout blocks that might not render properly 
+ content = content.replace(/^> \*\*([^*]+)\*\*\s*\n/gm, '> **$1**\n\n'); + + // Remove Notion's page dividers that don't have markdown equivalents + content = content.replace(/^---+\s*$/gm, ''); + + // Clean up empty blockquotes + content = content.replace(/^>\s*$/gm, ''); + + // Fix corrupted bold/italic formatting from notion-to-md conversion + // Pattern: ***text*** **** -> ***text*** + content = content.replace(/\*\*\*([^*]+)\*\*\*\s+\*\*\*\*/g, (match, text) => { + cleanedCount++; + return `***${text.trim()}***`; + }); + + // Fix other corrupted asterisk patterns + // Pattern: **text** ** -> **text** + content = content.replace(/\*\*([^*]+)\*\*\s+\*\*/g, (match, text) => { + cleanedCount++; + return `**${text.trim()}**`; + }); + + if (cleanedCount > 0) { + console.log(` ✅ Cleaned ${cleanedCount} Notion artifact(s)`); + } + + return content; +} + +/** + * Fix image alt text that contains markdown links + * notion-to-md v4 sometimes generates: ![alt with [link](url)](image_path) + * This breaks MDX parsing. Clean it to: ![alt with @mention](image_path) + * @param {string} content - Markdown content + * @returns {string} - Content with fixed image alt text + */ +function fixImageAltTextWithLinks(content) { + console.log(' 🖼️ Fixing image alt text with embedded links...'); + + let fixedCount = 0; + + // Pattern: ![text [link](url) more_text](image_path) + // This regex finds images where the alt text contains markdown links + const imageWithLinksPattern = /!\[([^\]]*\[[^\]]+\]\([^)]+\)[^\]]*)\]\(([^)]+)\)/g; + + content = content.replace(imageWithLinksPattern, (match, altText, imagePath) => { + fixedCount++; + + // Remove all markdown links from alt text: [text](url) -> text + const cleanedAlt = altText.replace(/\[([^\]]+)\]\([^)]+\)/g, '$1'); + + // Also clean up any remaining brackets + const finalAlt = cleanedAlt.replace(/[\[\]]/g, ''); + + console.log(` 🔧 Fixed: "${altText.substring(0, 50)}..." 
-> "${finalAlt.substring(0, 50)}..."`); + + return `![${finalAlt}](${imagePath})`; + }); + + if (fixedCount > 0) { + console.log(` ✅ Fixed ${fixedCount} image(s) with embedded links in alt text`); + } else { + console.log(' ℹ️ No images with embedded links found'); + } + + return content; +} + +/** + * Fix Notion internal links to be more MDX-friendly + * @param {string} content - Markdown content + * @returns {string} - Content with fixed links + */ +function fixNotionLinks(content) { + console.log(' 🔗 Fixing Notion internal links...'); + + let fixedCount = 0; + + // Convert Notion page links to relative links (assuming they'll be converted to MDX) + content = content.replace(/\[([^\]]+)\]\(https:\/\/www\.notion\.so\/[^/]+\/([^?#)]+)\)/g, (match, text, pageId) => { + fixedCount++; + // Convert to relative link - this will need to be updated based on your routing + return `[${text}](#${pageId})`; + }); + + // Fix broken notion.so links that might be malformed + content = content.replace(/\[([^\]]+)\]\(https:\/\/www\.notion\.so\/[^)]*\)/g, (match, text) => { + fixedCount++; + return text; // Remove broken links, keep text + }); + + if (fixedCount > 0) { + console.log(` ✅ Fixed ${fixedCount} Notion link(s)`); + } + + return content; +} + +/** + * Fix JSX attributes that were corrupted during Notion conversion + * @param {string} content - Markdown content + * @returns {string} - Content with fixed JSX attributes + */ +function fixJsxAttributes(content) { + console.log(' 🔧 Fixing JSX attributes corrupted by Notion conversion...'); + + let fixedCount = 0; + + // Fix the specific issue: + // Pattern: + content = content.replace(/<(\w+)\s+\*\s*([^*\s]+)\s*\*\s*=\s*"([^"]*)"\s*\/?>/g, (match, tagName, attribute, value) => { + fixedCount++; + return `<${tagName} ${attribute}="${value}" />`; + }); + + // Pattern: + content = content.replace(/<(\w+)\s+\*\s*([^*\s]+)\s*\*\s*=\s*([^>\s\/]+)\s*\/?>/g, (match, tagName, attribute, value) => { + fixedCount++; + return `<${tagName} 
${attribute}=${value} />`; + }); + + // Handle cases with **double asterisks** around attribute names + content = content.replace(/<(\w+)\s+\*\*\s*([^*\s]+)\s*\*\*\s*=\s*"([^"]*)"\s*\/?>/g, (match, tagName, attribute, value) => { + fixedCount++; + return `<${tagName} ${attribute}="${value}" />`; + }); + + content = content.replace(/<(\w+)\s+\*\*\s*([^*\s]+)\s*\*\*\s*=\s*([^>\s\/]+)\s*\/?>/g, (match, tagName, attribute, value) => { + fixedCount++; + return `<${tagName} ${attribute}=${value} />`; + }); + + // Fix HTML tags (like iframe, video, etc.) where URLs were corrupted by markdown conversion + // Pattern: src="[url](url)" -> src="url" + // Handle both regular quotes and various smart quote characters (", ", ', ', """, etc.) + // Handle attributes before and after src + + // Handle iframe tags with separate opening and closing tags FIRST: + content = content.replace(/]*?)\ssrc=[""''""\u201C\u201D\u2018\u2019]\[([^\]]+)\]\([^)]+\)[""''""\u201C\u201D\u2018\u2019]([^>]*?)>\s*<\/iframe>/gi, (match, before, urlText, after) => { + fixedCount++; + return ``; + }); + + // Handle self-closing iframe tags SECOND: + +#### Task and metrics + +You want to check which metrics are used: are they automatic, functional, or based on a model judge? The answer will change the cost of running evaluations for you, as well as their reproducibility and the type of bias they introduce. The best (but rarest) metrics are functional or based on rule-based verifiers. When doing code evals, beware of overly easy pass/fail unit tests! Recent LLMs have become very good at overwriting globals to 'cheat', especially in languages like Python where it is easy to tamper with variable scope. + +### So, you can't reproduce reported model scores?
+ + + +### Selecting good benchmarks automatically for model training + + + + + +## Creating your own evaluation + +At this stage, you likely have a good idea of why people do evaluation, which benchmarks exist and are relevant for different model stages (training, inference of base and tuned models), but what if nothing exists for your specific use case? + +This is precisely when you may want to create your own evaluation. + + + +## Conclusion + +Evaluation is both an art and a science. We've explored the landscape of LLM evaluation in 2025, from understanding why we evaluate models and the fundamental mechanics of tokenization and inference, to navigating the ever-evolving ecosystem of benchmarks, and finally to creating evaluations for your own use cases. + +Key things I hope you'll remember are: + +**Think critically about what you're measuring.** Evaluations are proxies for capabilities, so a high score on a benchmark doesn't guarantee real-world performance. Different evaluation approaches (automatic metrics, human judges, or model judges) each come with their own biases, limitations, and tradeoffs. + +**Match your evaluation to your goal.** Are you running ablations during training? Use fast, reliable benchmarks with strong signal even on small models. Comparing final models for selection? Focus on harder, uncontaminated datasets that test holistic capabilities. Building for a specific use case? Create custom evaluations that reflect your problems and data. + +**Reproducibility requires attention to detail.** Small differences in prompts, tokenization, normalization, templates, or random seeds can swing scores by several points. When reporting results, be transparent about your methodology. When trying to reproduce results, expect that exact replication will be extremely challenging even if you attempt to control for every variable.
+ +**Prefer interpretable evaluation methods.** When possible, functional testing and rule-based verifiers should be chosen over model judges. Evaluations that can be understood and debugged will provide clearer and more actionable insights... and the more interpretable your evaluation, the more you can improve your models! + +**Evaluation is never finished.** As models improve, benchmarks saturate. As training data grows, contamination becomes more likely. As use cases evolve, new capabilities need measuring. Evaluation is an ongoing battle! + +To conclude: The models we build are only as good as our ability to measure what matters. Thanks for reading! + + +### Acknowledgments + +Many thanks to all the people who contributed directly or indirectly to this document, notably Hynek Kydlicek, Loubna Ben Allal, Sander Land and Nathan Habib. \ No newline at end of file diff --git a/app/src/content/assets/audio/audio-example.mp3 b/app/src/content/assets/audio/audio-example.mp3 new file mode 100644 index 0000000000000000000000000000000000000000..5752af5fed84a92208f503c17fd418b67dd74da6 --- /dev/null +++ b/app/src/content/assets/audio/audio-example.mp3 @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:25334bdbaf5980acb854078acdbeb9f413f2ff71be3874e77fb5cd175403d2c9 +size 146330 diff --git a/app/src/content/assets/data/admin_seq_write.json b/app/src/content/assets/data/admin_seq_write.json new file mode 100644 index 0000000000000000000000000000000000000000..0ddb3743c54287f6c4f75289dc9ef1c06aa07036 --- /dev/null +++ b/app/src/content/assets/data/admin_seq_write.json @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8933d6923884634cc5fcf2d0000de55c43badc3e80de776bff230f5c01af439f +size 8290 diff --git a/app/src/content/assets/data/against_baselines copy.csv b/app/src/content/assets/data/against_baselines copy.csv new file mode 100644 index 0000000000000000000000000000000000000000..d2bbd2200fa92a4b0f34c47b019e92462670cd0a --- 
/dev/null +++ b/app/src/content/assets/data/against_baselines copy.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a5e6173a1541b9798278da1729f1e357c0711d2e270f68aa4af8eae962f146dd +size 53573 diff --git a/app/src/content/assets/data/against_baselines.csv b/app/src/content/assets/data/against_baselines.csv new file mode 100644 index 0000000000000000000000000000000000000000..d2bbd2200fa92a4b0f34c47b019e92462670cd0a --- /dev/null +++ b/app/src/content/assets/data/against_baselines.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a5e6173a1541b9798278da1729f1e357c0711d2e270f68aa4af8eae962f146dd +size 53573 diff --git a/app/src/content/assets/data/against_baselines_deduplicated.csv b/app/src/content/assets/data/against_baselines_deduplicated.csv new file mode 100644 index 0000000000000000000000000000000000000000..e180b302141004f1b9a0fe4a565d45dd3ca3f102 --- /dev/null +++ b/app/src/content/assets/data/against_baselines_deduplicated.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:56d18f581eff719023eb87c695e0e11770738d7872c8b9dac9bc23d9b0ef560b +size 32738 diff --git a/app/src/content/assets/data/all_ratings_luis.csv b/app/src/content/assets/data/all_ratings_luis.csv new file mode 100644 index 0000000000000000000000000000000000000000..249dacd118d20c655c64bf3d7c3dbd203eeb9477 --- /dev/null +++ b/app/src/content/assets/data/all_ratings_luis.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:104433529e7d9c8a3bd297be1138e9e87677a666953d1362c517ec389c6c9172 +size 64966 diff --git a/app/src/content/assets/data/attention_evals.csv b/app/src/content/assets/data/attention_evals.csv new file mode 100644 index 0000000000000000000000000000000000000000..fb97ea0dca30864febb5b9d596d90043f0fa7f08 --- /dev/null +++ b/app/src/content/assets/data/attention_evals.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:26c1151c2f83c4894dfc2e896a6da6539797e9e6bf40939a2bc49ed773f8e723 +size 70840 diff --git a/app/src/content/assets/data/attention_loss.csv b/app/src/content/assets/data/attention_loss.csv new file mode 100644 index 0000000000000000000000000000000000000000..57b576521d40b62a3414f24ffa75212be98616be --- /dev/null +++ b/app/src/content/assets/data/attention_loss.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:46dafcb1fb0e9c45fe0ec48a7d4283c3f925a922ddf560fe32969abfa02b95c9 +size 135780 diff --git a/app/src/content/assets/data/banner_visualisation_data.csv b/app/src/content/assets/data/banner_visualisation_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..3a7e33d02407b7c9cd2cd29f25b18d5f853641ff --- /dev/null +++ b/app/src/content/assets/data/banner_visualisation_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b155d8c319b1788befe716017fecca580768157feee6221f3af44b7bb9f9c7e5 +size 81995 diff --git a/app/src/content/assets/data/banner_visualisation_data_enriched.csv b/app/src/content/assets/data/banner_visualisation_data_enriched.csv new file mode 100644 index 0000000000000000000000000000000000000000..d429317e79e2d4c29eae0083fabae777a9720c5c --- /dev/null +++ b/app/src/content/assets/data/banner_visualisation_data_enriched.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:98eba5e5db19f482da8a3b26498c2fa633afa458f5b75e23d2dca24e24cc7596 +size 844651 diff --git a/app/src/content/assets/data/data.json b/app/src/content/assets/data/data.json new file mode 100644 index 0000000000000000000000000000000000000000..ff4c00b0eb62a459d8119023445e5e8d0b4caa64 --- /dev/null +++ b/app/src/content/assets/data/data.json @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6bbaf02f1b470da41754e3828e81e76ef386d9b3cfb8b57dcc7cbfd4225956cc +size 14121778 diff --git a/app/src/content/assets/data/data_gaia.json b/app/src/content/assets/data/data_gaia.json new 
file mode 100644 index 0000000000000000000000000000000000000000..ab20cbc449f911f9da3f474b29d72c7bbc2aa15b --- /dev/null +++ b/app/src/content/assets/data/data_gaia.json @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e69d6199e9f2d0b6b07db5a90f13b2fd9bc9e0e245d2b9aa60ac929967baddfa +size 3967 diff --git a/app/src/content/assets/data/data_gaia_backup.json b/app/src/content/assets/data/data_gaia_backup.json new file mode 100644 index 0000000000000000000000000000000000000000..756e31e09c3b72a7b78e8a49fd406329562a2399 --- /dev/null +++ b/app/src/content/assets/data/data_gaia_backup.json @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b075362b1571e36a66715126b5223a599ceaca74b0bff0d0d616662fca6a7ff3 +size 335376 diff --git a/app/src/content/assets/data/data_gaia_points.json b/app/src/content/assets/data/data_gaia_points.json new file mode 100644 index 0000000000000000000000000000000000000000..acd7200b83fb6236b8f359b3e78844203b937dd3 --- /dev/null +++ b/app/src/content/assets/data/data_gaia_points.json @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bc205f03ced31cfbc111bc096a5ba6fe4159f1b8365446dfffb70331d38bfdb8 +size 412546 diff --git a/app/src/content/assets/data/doc-masking_evals.csv b/app/src/content/assets/data/doc-masking_evals.csv new file mode 100644 index 0000000000000000000000000000000000000000..3e3f55b9688ada9e0dfdeaec765e49221b1ab89f --- /dev/null +++ b/app/src/content/assets/data/doc-masking_evals.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6052fa3042d49a5c2d5b370d3688380220bc612f22f73cff03f1e9354e09a330 +size 26360 diff --git a/app/src/content/assets/data/doc-masking_loss.csv b/app/src/content/assets/data/doc-masking_loss.csv new file mode 100644 index 0000000000000000000000000000000000000000..e872dd8b333e2275b5dea8218709d950ef4eaffd --- /dev/null +++ b/app/src/content/assets/data/doc-masking_loss.csv @@ -0,0 +1,3 @@ +version 
https://git-lfs.github.com/spec/v1 +oid sha256:cacd1a2f14b1c5b5c4342db37759149c5b60b83593bca9e38b7ca451e81cc086 +size 52272 diff --git a/app/src/content/assets/data/font-sprite-mapping.json b/app/src/content/assets/data/font-sprite-mapping.json new file mode 100644 index 0000000000000000000000000000000000000000..3486f5c59de7c5d39807391f6769d5edcad7800a --- /dev/null +++ b/app/src/content/assets/data/font-sprite-mapping.json @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ea1b487ebafe8d495737674a7eb6492b06551aeb97de79728afeb4aba7c39f29 +size 9766 diff --git a/app/src/content/assets/data/font-sprite.svg b/app/src/content/assets/data/font-sprite.svg new file mode 100644 index 0000000000000000000000000000000000000000..9226d661bbdf56751872b7fb0efc7b807296df58 --- /dev/null +++ b/app/src/content/assets/data/font-sprite.svg @@ -0,0 +1,884 @@
\ No newline at end of file diff --git a/app/src/content/assets/data/font_manifest.json b/app/src/content/assets/data/font_manifest.json new file mode 100644 index 0000000000000000000000000000000000000000..0c206b6b35c867bb6e55b88da82e0649f9a0b660 --- /dev/null +++ b/app/src/content/assets/data/font_manifest.json @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a587c5fd3fb85fdd26d485f57ef3e4feb8370593d46f57289b55f873beac4b4 +size 153794 diff --git a/app/src/content/assets/data/formatting_filters.csv b/app/src/content/assets/data/formatting_filters.csv new file mode 100644 index 0000000000000000000000000000000000000000..afcb8e40eafdcf74d5a1194c39771b25ef7c5878 --- /dev/null +++ b/app/src/content/assets/data/formatting_filters.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e5218781e5f018891311410d684785a3c661ca3cd25d2ac62bf45e6bb7d69e78 +size 63268 diff --git a/app/src/content/assets/data/fsx_seq_write.json b/app/src/content/assets/data/fsx_seq_write.json new file mode 100644 index 0000000000000000000000000000000000000000..8211d3dad96c5f596993547113d65bce2c3a6121 --- /dev/null +++ b/app/src/content/assets/data/fsx_seq_write.json @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:28480185dfefcea4453e8cb9d6a889b60130e99d4e59c9dba926162d3dcb62d6 +size 8295 diff --git
a/app/src/content/assets/data/image_correspondence_filters.csv b/app/src/content/assets/data/image_correspondence_filters.csv new file mode 100644 index 0000000000000000000000000000000000000000..409176cf9b54c834ef062c34ad0908565f169e59 --- /dev/null +++ b/app/src/content/assets/data/image_correspondence_filters.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:64a8af61666421e33d02bf0e52d9df576a6a831677910b3631e8b02069e380a6 +size 60206 diff --git a/app/src/content/assets/data/internal_deduplication.csv b/app/src/content/assets/data/internal_deduplication.csv new file mode 100644 index 0000000000000000000000000000000000000000..a55377c1b56e823092ac474af9d1650b0e8835d4 --- /dev/null +++ b/app/src/content/assets/data/internal_deduplication.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d6b6bf0d84fe1bc67436c70f9a8d5919627e9c2bc9c3f931f4af80c01be22649 +size 47060 diff --git a/app/src/content/assets/data/leaderboard_scatter_plot.json b/app/src/content/assets/data/leaderboard_scatter_plot.json new file mode 100644 index 0000000000000000000000000000000000000000..6009cb0248915516fdd0dcc568524a99caed83c4 --- /dev/null +++ b/app/src/content/assets/data/leaderboard_scatter_plot.json @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a7b7b0c94cdd535f66dc6ac532fd80b434eddddd0dc8c36076c462c0a72faa9b +size 7688907 diff --git a/app/src/content/assets/data/leaderboard_scores_over_time.json b/app/src/content/assets/data/leaderboard_scores_over_time.json new file mode 100644 index 0000000000000000000000000000000000000000..675714e3f7696b444c74d82bc07903d576aa63d0 --- /dev/null +++ b/app/src/content/assets/data/leaderboard_scores_over_time.json @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9c6dea46d998952f9cbfeb219afcdb403b90c28f0645e03ae537dfaa8c82b57d +size 40874 diff --git a/app/src/content/assets/data/leaderboard_scores_over_time_old.json 
b/app/src/content/assets/data/leaderboard_scores_over_time_old.json new file mode 100644 index 0000000000000000000000000000000000000000..58ae7431310626a99400e70459d2e50ceed86860 --- /dev/null +++ b/app/src/content/assets/data/leaderboard_scores_over_time_old.json @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9d5b994b4fd6d636b2a58fabced9fcd929427501b86d0937b635a67d34872bb1 +size 55411 diff --git a/app/src/content/assets/data/llm_benchmarks.json b/app/src/content/assets/data/llm_benchmarks.json new file mode 100644 index 0000000000000000000000000000000000000000..2df835086e98803d8a39d558eb44af89758d08ef --- /dev/null +++ b/app/src/content/assets/data/llm_benchmarks.json @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:67f10bd2fdcce7231c40db14b013a587e0dcecf02e427b10bb159669576e9ad8 +size 834 diff --git a/app/src/content/assets/data/mnist-variant-model.json b/app/src/content/assets/data/mnist-variant-model.json new file mode 100644 index 0000000000000000000000000000000000000000..f6d4cc17abe497e553a2ac87687770c7f7047e3f --- /dev/null +++ b/app/src/content/assets/data/mnist-variant-model.json @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7dca86e85be46c1fca6a4e2503786e88e3f8d4609fb7284c8a1479620a5827da +size 4315 diff --git a/app/src/content/assets/data/no-wd_evals.csv b/app/src/content/assets/data/no-wd_evals.csv new file mode 100644 index 0000000000000000000000000000000000000000..f067328ab715b8c711bf019b0f7291a462d54436 --- /dev/null +++ b/app/src/content/assets/data/no-wd_evals.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b003b8eb447244e804953b26cf991460895928d2b40d64b3db01da8c3971961a +size 42062 diff --git a/app/src/content/assets/data/no_wd_comparison.csv b/app/src/content/assets/data/no_wd_comparison.csv new file mode 100644 index 0000000000000000000000000000000000000000..122ebf035e9a96daa11352c6a21d076e171ec228 --- /dev/null +++ 
b/app/src/content/assets/data/no_wd_comparison.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7d10e959763dda8b6c9e5a987b2d97af004bd8a51c8680f6a9a18cf84d85da6b +size 83830 diff --git a/app/src/content/assets/data/nope_evals.csv b/app/src/content/assets/data/nope_evals.csv new file mode 100644 index 0000000000000000000000000000000000000000..0208827a32c68fd25cbe36bb78a10782c41dacd1 --- /dev/null +++ b/app/src/content/assets/data/nope_evals.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:aaceaec65082326c22a6a92757b90a37a0cfaa68bc4fa3f96d0e6b6c3d7c1cbb +size 34900 diff --git a/app/src/content/assets/data/nope_loss.csv b/app/src/content/assets/data/nope_loss.csv new file mode 100644 index 0000000000000000000000000000000000000000..47ccd398971111722bdd406b12f0f57bb21ba315 --- /dev/null +++ b/app/src/content/assets/data/nope_loss.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ef7b55386054b9db1c1e2fbafc2f5b100af2ef8812523a546c40d9db08410a86 +size 69964 diff --git a/app/src/content/assets/data/relevance_filters.csv b/app/src/content/assets/data/relevance_filters.csv new file mode 100644 index 0000000000000000000000000000000000000000..2cce8f33805f443a7f17a99e2d83376d9bac9dc8 --- /dev/null +++ b/app/src/content/assets/data/relevance_filters.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:69acb8bc0b80b2c664d821b1c06d67af315e67d8a706cf9e5d351e4468392cc6 +size 63236 diff --git a/app/src/content/assets/data/remove_ch.csv b/app/src/content/assets/data/remove_ch.csv new file mode 100644 index 0000000000000000000000000000000000000000..733d7d6a4b82735330ad7b46136dbe42f7705956 --- /dev/null +++ b/app/src/content/assets/data/remove_ch.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:869fc4724af7e9c868b6024f472f9ae0f6468b74ef61db101438f80610828abb +size 28837 diff --git a/app/src/content/assets/data/root-seq-write-heatmaps.json 
b/app/src/content/assets/data/root-seq-write-heatmaps.json new file mode 100644 index 0000000000000000000000000000000000000000..6bdd6673c760f95eb0818ffa92266e87bfee0ab4 --- /dev/null +++ b/app/src/content/assets/data/root-seq-write-heatmaps.json @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3cd5eec18185386c9641343aa2da610a66fbd38611e4927bb016d369a8fe3972 +size 6651 diff --git a/app/src/content/assets/data/root_seq_write.json b/app/src/content/assets/data/root_seq_write.json new file mode 100644 index 0000000000000000000000000000000000000000..11a6631360b461537729fddbbd5100c0cc6e6fac --- /dev/null +++ b/app/src/content/assets/data/root_seq_write.json @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5cd3c3121decb1450e43d83a6d4b93145dbb269de1546d2678e3d01710b82ec1 +size 8286 diff --git a/app/src/content/assets/data/s25_ratings.csv b/app/src/content/assets/data/s25_ratings.csv new file mode 100644 index 0000000000000000000000000000000000000000..9eececb8353846ae8c3dbee4171df39ea5a9c209 --- /dev/null +++ b/app/src/content/assets/data/s25_ratings.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ca22654a0302da0ca335420b0a89cd770cea560b11f2a9f9f25927877d7ed231 +size 61626 diff --git a/app/src/content/assets/data/scratch_seq_write.json b/app/src/content/assets/data/scratch_seq_write.json new file mode 100644 index 0000000000000000000000000000000000000000..6c02010b3f40590d38801f2507c7f965af455542 --- /dev/null +++ b/app/src/content/assets/data/scratch_seq_write.json @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:71a79f782fb57c9d1c2dd19aa8ace5a40c6be7088edfb6afac4c151987fe33e9 +size 8330 diff --git a/app/src/content/assets/data/ss_vs_s1.csv b/app/src/content/assets/data/ss_vs_s1.csv new file mode 100644 index 0000000000000000000000000000000000000000..a56d5aa8a6a94d96d9fcebb65a1f262ac03956c1 --- /dev/null +++ b/app/src/content/assets/data/ss_vs_s1.csv @@ -0,0 +1,3 @@ +version 
https://git-lfs.github.com/spec/v1 +oid sha256:3f076631fcad76129ed8cab03c72a61965b465e1f3e7fa8dc68b7c7a9275616b +size 28041 diff --git a/app/src/content/assets/data/tied-embeddings_evals.csv b/app/src/content/assets/data/tied-embeddings_evals.csv new file mode 100644 index 0000000000000000000000000000000000000000..df2dd70c47839dde56ef92d6841dd8b7ed5e8706 --- /dev/null +++ b/app/src/content/assets/data/tied-embeddings_evals.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0b6354ce2d0dd8fbe275d6b5701025ec2435a58b30fce92f8600eb9e45e9f0b +size 49810 diff --git a/app/src/content/assets/data/typography_data.json b/app/src/content/assets/data/typography_data.json new file mode 100644 index 0000000000000000000000000000000000000000..9057224d644626db00e2176140d9b8e1db9dd180 --- /dev/null +++ b/app/src/content/assets/data/typography_data.json @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:403e0095f2bcaa963cdfbb8d00a3695565623764d6f87b890bf58d7a2304acfc +size 68739 diff --git a/app/src/content/assets/data/vision.csv b/app/src/content/assets/data/vision.csv new file mode 100644 index 0000000000000000000000000000000000000000..f5ded8db9f405c12478eeb75bf10ae326d83f43a --- /dev/null +++ b/app/src/content/assets/data/vision.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d28bd13dc3a9ff100c82e8c9dc59270563b865383d09cf28c5aba5812bfa75ee +size 10913 diff --git a/app/src/content/assets/data/visual_dependency_filters.csv b/app/src/content/assets/data/visual_dependency_filters.csv new file mode 100644 index 0000000000000000000000000000000000000000..3cea8673216195618e97ad920d6db48b85412a5e --- /dev/null +++ b/app/src/content/assets/data/visual_dependency_filters.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a967b10ba4a1034f4d6da250d267a6af51722c3f6dbae0ef0221a62d53502d69 +size 60114 diff --git a/app/src/content/assets/data/zloss_comparison.csv 
b/app/src/content/assets/data/zloss_comparison.csv new file mode 100644 index 0000000000000000000000000000000000000000..4cbb7d958c991fff0420cb6cf6eb10115a8d9630 --- /dev/null +++ b/app/src/content/assets/data/zloss_comparison.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0ef8312bf43933601cb226df5b636261410b59f2599baca92be8ef0a612dd6a +size 49739 diff --git a/app/src/content/assets/data/zloss_evals.csv b/app/src/content/assets/data/zloss_evals.csv new file mode 100644 index 0000000000000000000000000000000000000000..c5d635639b83a0cd6b8861f0ae8e19ea96cea8ee --- /dev/null +++ b/app/src/content/assets/data/zloss_evals.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:78dac31df718d071f04e55fb5e2845894cf5b68d989d14a1a568b9aec5ded5c3 +size 22541 diff --git a/app/src/content/assets/finetasks/code.js b/app/src/content/assets/finetasks/code.js new file mode 100644 index 0000000000000000000000000000000000000000..ba9ca62a8067c3083bb6f88b1bc36f8093d8b94f --- /dev/null +++ b/app/src/content/assets/finetasks/code.js @@ -0,0 +1,619 @@ +import Plotly from 'plotly.js-basic-dist-min'; +import Papa from 'papaparse'; +import _ from 'lodash'; +import { getColor } from './colors.mjs'; + +const languageMap = { + 'Arabic': 'ar', + 'Turkish': 'tr', + 'Swahili': 'sw', + 'Russian': 'ru', + 'Telugu': 'te', + 'Thai': 'th', + 'Chinese': 'zh', + 'French': 'fr', + 'Hindi': 'hi' +}; + +const runNameMap = { + "orion": "Dataset-A", + "helios": "Dataset-B", + "lynx": "Dataset-C", + "aquila": "Dataset-D", + "commoncrawl": "CommonCrawl", + "baseline": "Baseline" +}; + +const taskLists = { + ar: ['acva_ara:_average', 'alfgahafa_mlqa_ara_cf', 'alghafa_arc_ara_cf:easy', 'alghafa_facts_ara_cf', 'alghafa_meta_dialects_ara_cf', 'alghafa_mmlu_ara_cf:_average', 'alghafa_openbookqa_ara_cf', 'alghafa_piqa_ara_cf', 'alghafa_race_ara_cf', 'alghafa_rating_sentiment_ara_cf', 'alghafa_rating_sentiment_no_neutral_ara_cf', 'alghafa_sciqa_ara_cf', 
'alghafa_sentiment_ara_cf', 'arcd_ara', 'belebele_arb_Arab_cf', 'boolq_ara', 'exams_ara_cf:_average', 'mkqa_ara:_average', 'mlmm_arc_ara_cf:challenge', 'mlmm_hellaswag_ara_cf', 'mlmm_mmlu_ara_cf:_average', 'mlmm_truthfulqa_ara_cf:mc1', 'mlmm_truthfulqa_ara_cf:mc2', 'mlqa_ara', 'mmlu_ara_cf:_average', 'soqal_ara_cf', 'toxigen_ara_cf', 'tydiqa_ara', 'xcodah_ara_cf', 'xcopa_ara_cf', 'xcsqa_ara_cf', 'xnli2.0_ara_cf', 'xnli_ara_cf', 'xquad_ara', 'xstory_cloze_ara_cf'], + fr: ['belebele_fra_Latn_cf', 'community_boolq_fra_cf', 'exams_fra_cf:_average', 'fquadv2_fra', 'frenchbench_arc_fra_cf:challenge', 'frenchbench_hellaswag_fra_cf', 'meta_mmlu_fra_cf:_average', 'mintaka_fra', 'mkqa_fra:_average', 'mlmm_arc_fra_cf:challenge', 'mlmm_hellaswag_fra_cf', 'mlmm_mmlu_fra_cf:_average', 'mlmm_truthfulqa_fra_cf:mc1', 'mlmm_truthfulqa_fra_cf:mc2', 'pawsx_fra_cf', 'xcodah_fra_cf', 'xcsqa_fra_cf', 'xnli2.0_fra_cf', 'xwinograd_fra_cf'], + hi: ['belebele_hin_Deva_cf', 'community_arc_hin_cf:challenge', 'community_arc_hin_cf:easy', 'community_boolq_hin', 'community_hellaswag_hin_cf', 'indicnxnli_hin_cf', 'indicqa_hin', 'indicxcopa_hin_cf', 'meta_mmlu_hin_cf:_average', 'mintaka_hin', 'mlmm_arc_hin_cf:challenge', 'mlmm_hellaswag_hin_cf', 'mlmm_mmlu_hin_cf:_average', 'mlmm_truthfulqa_hin_cf:mc1', 'mlmm_truthfulqa_hin_cf:mc2', 'mlqa_hin', 'xcodah_hin_cf', 'xcsqa_hin_cf', 'xnli2.0_hin_cf', 'xnli_hin_cf', 'xquad_hin', 'xstory_cloze_hin_cf'], + ru: ['belebele_rus_Cyrl_cf', 'chegeka_rus', 'mathlogic_qa_rus_cf', 'mera_openbookqa_rus_cf', 'mera_worldtree_rus_cf', 'mkqa_rus:_average', 'mlmm_arc_rus_cf:challenge', 'mlmm_hellaswag_rus_cf', 'mlmm_mmlu_rus_cf:_average', 'mlmm_truthfulqa_rus_cf:mc1', 'mlmm_truthfulqa_rus_cf:mc2', 'parus_rus_cf', 'rcb_rus_cf', 'rummlu_rus_cf:_average', 'sber_squad_rus', 'tydiqa_rus', 'xcodah_rus_cf', 'xcsqa_rus_cf', 'xnli2.0_rus_cf', 'xquad_rus', 'xstory_cloze_rus_cf', 'xwinograd_rus_cf'], + sw: ['afric_mmlu_swa_cf:_average', 'afric_xnli_swa_cf', 'belebele_swh_Latn_cf', 
'community_arc_swa_cf:challenge', 'community_arc_swa_cf:easy', 'community_mmlu_swa_cf', 'kenswquad_swa', 'm3exams_swa_cf', 'openai_mmlu_swa_cf:_average', 'tydiqa_swa', 'xcodah_swa_cf', 'xcopa_swa_cf', 'xcsqa_swa_cf', 'xnli2.0_swa_cf', 'xnli_swa_cf', 'xstory_cloze_swa_cf'], + te: ['belebele_tel_Telu_cf', 'community_hellaswag_tel_cf', 'indicnxnli_tel_cf', 'indicqa_tel', 'indicxcopa_tel_cf', 'mlmm_arc_tel_cf:challenge', 'mlmm_hellaswag_tel_cf', 'mlmm_mmlu_tel_cf:_average', 'mlmm_truthfulqa_tel_cf:mc1', 'mlmm_truthfulqa_tel_cf:mc2', 'tydiqa_tel', 'xstory_cloze_tel_cf'], + th: ['belebele_tha_Thai_cf', 'community_hellaswag_tha_cf', 'm3exams_tha_cf', 'meta_mmlu_tha_cf:_average', 'mkqa_tha:_average', 'thai_exams_tha_cf:_average', 'thai_exams_tha_cf:tgat', 'thaiqa_tha', 'wsci_tha_cf', 'xcopa_tha_cf', 'xnli2.0_tha_cf', 'xnli_tha_cf', 'xquad_tha'], + tr: ['belebele_tur_Latn_cf', 'community_arc_tur_cf:easy', 'community_hellaswag_tur_cf', 'community_mmlu_tur_cf:_average', 'community_truthfulqa_tur_cf:mc1', 'community_truthfulqa_tur_cf:mc2', 'community_xwinograd_tur_cf', 'exams_tur_cf:_average', 'mkqa_tur:_average', 'tquadv2_tur', 'xcopa_tur_cf', 'xnli2.0_tur_cf', 'xnli_tur_cf', 'xquad_tur'], + zh: ['agieval_zho_cf:_average', 'belebele_zho_Hans_cf', 'c3_zho_cf', 'ceval_zho_cf:_average', 'chinese_squad_zho', 'cmath_zho_cf', 'cmmlu_zho_cf:_average', 'cmnli_zho_cf', 'cmrc2018_zho', 'm3exams_zho_cf', 'mkqa_zho:_average', 'mlmm_arc_zho_cf:challenge', 'mlmm_hellaswag_zho_cf', 'mlmm_mmlu_zho_cf:_average', 'mlmm_truthfulqa_zho_cf:mc1', 'mlmm_truthfulqa_zho_cf:mc2', 'ocnli_zho_cf', 'pawsx_zho_cf', 'xcodah_zho_cf', 'xcopa_zho_cf', 'xcsqa_zho_cf', 'xnli2.0_zho_cf', 'xnli_zho_cf', 'xquad_zho', 'xstory_cloze_zho_cf', 'xwinograd_zho_cf'] +}; + +const LINE_SETTINGS = { + width: 2.5, + type: "scatter", + mode: "lines+markers", +}; + +const DEFAULT_LAYOUT = { + font: { + family: "apple-system, Arial, sans-serif", + }, + title: { + font: { + size: 15, + }, + }, + xaxis: { + title: { + text: 
"Training Tokens (billions)", + font: { + size: 14, + }, + }, + tickfont: { + size: 12, + }, + showgrid: false, + mirror: true, + ticks: "outside", + showline: true, + }, + yaxis: { + title: { + font: { + size: 14, + }, + standoff: 10, + }, + showgrid: false, + mirror: true, + ticks: "outside", + showline: true, + tickfont: { + size: 12, + }, + }, + height: 300, // You can adjust this value + autosize: true, + legend: { + orientation: 'h', // Set to 'h' for horizontal legend (required for columns) + yanchor: 'bottom', + y: 0, // Position at the bottom + xanchor: 'right', + x: 1, // Position at the right + traceorder: 'normal', + font: { size: 12 }, + tracegroupgap: 0, // Space between legend items + bgcolor: 'rgba(255, 255, 255, 0.8)' // White background with 70% transparency (1 - 0.3 = 70%) + }, + margin: { + t: 25, + b: 60, + l: 60, + r: 40, + }, +}; + +export function initPlotApplets() { + const plotContainers = document.querySelectorAll('.task-signal-plot'); + plotContainers.forEach(container => { + initPlotApplet(container); + }); +} + +function initPlotApplet(container) { + const defaultLanguage = container.dataset.language || 'Arabic'; + const defaultTask = container.dataset.task || ''; + const defaultMetric = container.dataset.metric || ''; + const groupSeeds = container.dataset.groupSeeds === 'true'; + const showControls = container.dataset.showControls === 'true'; + const taskMetrics = (container.dataset.taskMetrics || 'monotonicity,snr,ordering,randomness').split(","); + + const controls = createControls(container, defaultLanguage, defaultTask, defaultMetric, taskMetrics); + if (!showControls) + controls.style.display = 'none'; + container.appendChild(controls); + + const plotContainer = document.createElement('div'); + plotContainer.className = 'plot-container'; + container.appendChild(plotContainer); + + const statsContainer = document.createElement('div'); + statsContainer.className = 'stats-container'; + container.appendChild(statsContainer); + + + 
// Create an initial empty plot + Plotly.newPlot(plotContainer, []); + + // Set up the resize function + const resizePlot = () => { + const width = container.offsetWidth; + Plotly.relayout(plotContainer, { width: width }); + }; + + // Add resize listener + window.addEventListener('resize', resizePlot); + + // Initial resize + resizePlot(); + + // Load the initial data + updateLanguageTasks(container, defaultTask, defaultMetric, groupSeeds, taskMetrics); +} + +function createControls(container, defaultLanguage, defaultTask, defaultMetric, taskMetrics) { + const controls = document.createElement('div'); + controls.className = 'controls'; + + const languageSelect = createSelect('language', Object.keys(languageMap), () => updateLanguageTasks(container, '', '', true, taskMetrics)); + languageSelect.value = defaultLanguage; + + const taskSelect = createSelect('task', [], () => updateMetrics(container, '', true, taskMetrics)); + const metricSelect = createSelect('metric', [], () => updatePlot(container, taskMetrics)); + + controls.appendChild(createControlGroup('Language:', languageSelect)); + controls.appendChild(createControlGroup('Task:', taskSelect)); + controls.appendChild(createControlGroup('Metric:', metricSelect)); + + return controls; +} + +function createSelect(id, options, onChangeHandler) { + const select = document.createElement('select'); + select.id = id; + options.forEach(option => { + const optionElement = document.createElement('option'); + optionElement.value = option; + optionElement.textContent = option; + select.appendChild(optionElement); + }); + select.addEventListener('change', onChangeHandler); + return select; +} + +function createControlGroup(labelText, inputElement) { + const group = document.createElement('div'); + group.className = 'control-group'; + + const label = document.createElement('label'); + label.textContent = labelText; + label.className = 'control-label'; + + group.appendChild(label); + group.appendChild(inputElement); + + return 
group; +} + +async function updateLanguageTasks(container, defaultTask = '', defaultMetric = '', groupSeeds, taskMetrics) { + const languageSelect = container.querySelector('#language'); + const taskSelect = container.querySelector('#task'); + const language = languageSelect.value; + const langCode = languageMap[language]; + + taskSelect.innerHTML = ''; + + try { + const tasks = await getTasksForLanguage(langCode); + + taskSelect.innerHTML = ''; + if (tasks.length > 0) { + tasks.forEach(task => { + const option = document.createElement('option'); + option.value = task; + option.textContent = truncateText(task, 25); // Reduced from 30 to 25 + option.title = task; // Set full task name as title for tooltip + taskSelect.appendChild(option); + }); + + if (defaultTask && tasks.includes(defaultTask)) { + taskSelect.value = defaultTask; + } else { + taskSelect.selectedIndex = 0; + } + + await updateMetrics(container, defaultMetric, groupSeeds, taskMetrics); + } else { + taskSelect.innerHTML = ''; + clearPlot(container); + } + } catch (error) { + console.error('Error fetching tasks:', error); + taskSelect.innerHTML = ''; + clearPlot(container); + } +} + +async function getTasksForLanguage(langCode) { + return taskLists[langCode] || []; +} + +async function updateMetrics(container, defaultMetric = '', groupSeeds, taskMetrics) { + const language = container.querySelector('#language').value; + const task = container.querySelector('#task').value; + const langCode = languageMap[language]; + const metricSelect = container.querySelector('#metric'); + + metricSelect.innerHTML = ''; + + try { + const metrics = await getMetricsForTask(langCode, task); + + metricSelect.innerHTML = ''; + metrics.forEach(metric => { + const option = document.createElement('option'); + option.value = metric; + option.textContent = metric; + metricSelect.appendChild(option); + }); + + if (defaultMetric && metrics.includes(defaultMetric)) { + metricSelect.value = defaultMetric; + } else if 
(metricSelect.options.length > 0) { + metricSelect.selectedIndex = 0; + } + + await updatePlot(container, taskMetrics); + } catch (error) { + console.error('Error fetching metrics:', error); + metricSelect.innerHTML = ''; + clearPlot(container); + } +} + +async function getMetricsForTask(langCode, task) { + return new Promise((resolve, reject) => { + Papa.parse(`data/${langCode}/${task}_stats.csv`, { + download: true, + header: true, + complete: function(results) { + const metrics = [...new Set(results.data.map(row => row.metric).filter(metric => metric))]; + resolve(metrics); + }, + error: function(error) { + console.error('Error fetching metrics:', error); + reject(error); + } + }); + }); +} + +function updatePlot(container, taskMetrics) { + const language = container.querySelector('#language').value; + const task = container.querySelector('#task').value; + const metric = container.querySelector('#metric').value; + const title = container.dataset.title; + const langCode = languageMap[language]; + + if (!langCode || !task || !metric) { + clearPlot(container); + return; + } + + const dataUrl = `data/${langCode}/${task}_data.csv`; + const statsUrl = `data/${langCode}/${task}_stats.csv`; + + Promise.all([ + new Promise((resolve, reject) => { + Papa.parse(dataUrl, { + download: true, + header: true, + dynamicTyping: true, + complete: resolve, + error: reject + }); + }), + new Promise((resolve, reject) => { + Papa.parse(statsUrl, { + download: true, + header: true, + dynamicTyping: true, + complete: resolve, + error: reject + }); + }) + ]).then(([dataResult, statsResult]) => { + const taskData = dataResult.data; + const statsData = statsResult.data; + plotData(container, taskData, statsData, metric, title, taskMetrics); + }).catch(error => { + console.error('Error parsing CSV:', error); + clearPlot(container); + }); +} + +function plotData(container, data, stats, metric, title, taskMetrics) { + const groupSeeds = container.dataset.groupSeeds === 'true'; + const 
sortedData = sortDataByTokens(data); + const groupedData = groupDataByRunname(sortedData, groupSeeds, metric); + const interpolatedData = interpolateData(groupedData, metric); + const smoothedData = smoothData(interpolatedData, metric); + const traces = createTraces(smoothedData, metric); + + const plotContainer = container.querySelector('.plot-container'); + + const layout = _.merge({}, DEFAULT_LAYOUT, { + title: { text: `${title}` }, + xaxis: { + title: { text: 'Training Tokens (billions)' }, + tickvals: [0, 5, 10, 15, 20, 25], + ticktext: ['0', '5B', '10B', '15B', '20B', '25B'], + tickangle: 45, + range: [0, 30], // Set the range to start from 0 and end at 30B + }, + yaxis: { + title: { text: 'Score' }, + range: [Math.min(...traces.flatMap(trace => trace.y)) * 0.95, Math.max(...traces.flatMap(trace => trace.y)) * 1.05], // Add 5% padding to the top and bottom + }, + width: container.offsetWidth, + }); + + Plotly.newPlot(plotContainer, traces, layout, {responsive: true}); + + // Display statistics + displayStatistics(container, stats, metric, taskMetrics); +} + +function displayStatistics(container, stats, metric, taskMetrics) { + const statsContainer = container.querySelector('.stats-container'); + const metricStats = stats.find(stat => stat.metric === metric); + if (metricStats) { + statsContainer.innerHTML = ` +
        + ${taskMetrics.includes('monotonicity') ? 'Monotonicity: ' + metricStats.avg_spearman.toFixed(2) + '' : ''} + ${taskMetrics.includes('snr') ? 'Signal-to-Noise: ' + metricStats.avg_snr.toFixed(2) + '' : ''} + ${taskMetrics.includes('ordering') ? 'Ordering Consistency: ' + metricStats.avg_kendall_tau_a.toFixed(2) + '' : ''} + ${taskMetrics.includes('randomness') ? 'Non-Randomness: ' + metricStats.max_n_std.toFixed(2) + '' : ''} +
        + `; + } else { + statsContainer.innerHTML = 'No statistics available for this metric.'; + } +} + +function getReducedTickValues(tokens) { + const uniqueTokens = [...new Set(tokens)].sort((a, b) => a - b); + const tokenCount = uniqueTokens.length; + const targetTickCount = 10; // Adjust this value to increase/decrease the number of ticks + + if (tokenCount <= targetTickCount) { + return uniqueTokens; + } + + const stride = Math.ceil(tokenCount / targetTickCount); + return uniqueTokens.filter((_, index) => index % stride === 0); +} + +function formatTickLabel(value) { + if (value >= 1e9) { + return (value / 1e9).toFixed(1) + 'B'; + } else if (value >= 1e6) { + return (value / 1e6).toFixed(1) + 'M'; + } else if (value >= 1e3) { + return (value / 1e3).toFixed(1) + 'K'; + } + return value.toString(); +} + +function computeStatistics(data, metric) { + const stats = { + avg_spearman: 0, + avg_kendall_tau_a: 0, + avg_snr: 0, + max_n_std: 0 + }; + + const baselineRun = Object.keys(data).find(key => key.toLowerCase().includes('baseline')); + const nonBaselineRuns = Object.keys(data).filter(key => key !== baselineRun); + + // Compute statistics for each non-baseline run + nonBaselineRuns.forEach(run => { + const runData = data[run]; + const tokens = runData.map(row => row.tokens); + const scores = runData.map(row => row[metric]); + + // Spearman correlation + stats.avg_spearman += spearmanCorrelation(tokens, scores); + + // Kendall Tau-a + const lastHalf = Math.floor(runData.length / 2); + const kendallTauValues = []; + for (let i = lastHalf; i < runData.length - 1; i++) { + kendallTauValues.push(kendallTauA(scores.slice(0, i + 1), scores.slice(0, i + 2))); + } + stats.avg_kendall_tau_a += _.mean(kendallTauValues); + + // SNR and max_n_std + if (baselineRun) { + const baselineScores = data[baselineRun].map(row => row[metric]); + const stdDev = standardDeviation(scores); + stats.avg_snr += _.mean(scores) / stdDev; + stats.max_n_std = Math.max(stats.max_n_std, (_.max(scores) - _.mean(baselineScores)) / stdDev); + } + }); + + // Average the statistics + 
const numRuns = nonBaselineRuns.length; + stats.avg_spearman /= numRuns; + stats.avg_kendall_tau_a /= numRuns; + stats.avg_snr /= numRuns; + + return stats; +} + +function spearmanCorrelation(x, y) { + const n = x.length; + const rankX = rankData(x); + const rankY = rankData(y); + + let sum_d_squared = 0; + for (let i = 0; i < n; i++) { + const d = rankX[i] - rankY[i]; + sum_d_squared += d * d; + } + + return 1 - (6 * sum_d_squared) / (n * (n * n - 1)); +} + +function rankData(data) { + const sorted = [...data].sort((a, b) => a - b); + return data.map(x => sorted.indexOf(x) + 1); +} + +function kendallTauA(x, y) { + const n = x.length; + let concordant = 0; + let discordant = 0; + + for (let i = 0; i < n; i++) { + for (let j = i + 1; j < n; j++) { + const sign_x = Math.sign(x[j] - x[i]); + const sign_y = Math.sign(y[j] - y[i]); + if (sign_x * sign_y > 0) concordant++; + else if (sign_x * sign_y < 0) discordant++; + } + } + + return (concordant - discordant) / (n * (n - 1) / 2); +} + +function standardDeviation(values) { + const mean = _.mean(values); + const squareDiffs = values.map(value => { + const diff = value - mean; + return diff * diff; + }); + const avgSquareDiff = _.mean(squareDiffs); + return Math.sqrt(avgSquareDiff); +} + +function interpolateData(data, metric) { + return _.mapValues(data, (rows) => { + const sortedRows = _.sortBy(rows, 'tokens'); + const allTokens = _.uniq(_.flatMap(Object.values(data), rows => rows.map(r => r.tokens))).sort((a, b) => a - b); + + return allTokens.map(token => { + const exactMatch = _.find(sortedRows, { tokens: token }); + if (exactMatch) return exactMatch; + + const lowerRow = _.findLast(sortedRows, r => r.tokens < token); + const upperRow = _.find(sortedRows, r => r.tokens > token); + + if (!lowerRow) return { ...upperRow, tokens: token }; + if (!upperRow) return { ...lowerRow, tokens: token }; + + const ratio = (token - lowerRow.tokens) / (upperRow.tokens - lowerRow.tokens); + const interpolatedMetric = 
lowerRow[metric] + (upperRow[metric] - lowerRow[metric]) * ratio; + + return { + ...lowerRow, + tokens: token, + [metric]: interpolatedMetric + }; + }); + }); +} + +function smoothData(data, metric, windowSize = 3) { + return _.mapValues(data, (rows) => { + return rows.map((row, index, array) => { + const window = array.slice(Math.max(0, index - windowSize + 1), index + 1); + const smoothedMetric = _.meanBy(window, r => r[metric]); + return { ...row, [metric]: smoothedMetric }; + }); + }); +} + +function sortDataByTokens(data) { + return _.sortBy(data, 'tokens'); +} + +function groupDataByRunname(data, groupSeeds, metric) { + // Remove null or undefined runs + data = data.filter(row => row.runname != null && row.runname !== 'null_undefined'); + + if (!groupSeeds) { + return _.groupBy(data, row => `${processRunName(row.runname)}_${row.seed}`); + } + + const grouped = _.groupBy(data, row => processRunName(row.runname)); + + return _.mapValues(grouped, (rows) => { + const stepGroups = _.groupBy(rows, 'tokens'); + return _.map(stepGroups, (stepRows) => { + const meanMetric = _.meanBy(stepRows, row => parseFloat(row[metric]) || 0); + return { + ...stepRows[0], + [metric]: meanMetric + }; + }); + }); +} + +function processRunName(runname) { + for (const [key, value] of Object.entries(runNameMap)) { + if (runname.includes(key)) { + return value; + } + } + return runname; +} + +function createTraces(groupedData, metric) { + const colorsMapping = new Map(); + // runNameMap renames "baseline" to "Baseline", so match case-insensitively to keep the baseline trace last + const isBaseline = name => name.toLowerCase().includes('baseline'); + const sortedRunnames = Object.keys(groupedData).sort((a, b) => { + if (isBaseline(a) !== isBaseline(b)) return isBaseline(a) ? 1 : -1; + return a.localeCompare(b); + }); + + return sortedRunnames.map((runname, index) => { + const color = getColorForTrace(runname, colorsMapping, index); + return { + x: groupedData[runname].map(row => row.tokens), + y: groupedData[runname].map(row => row[metric]), + name: runname, + line: { + color: color, + shape: 'spline', + ...LINE_SETTINGS + }, + marker: { + color: color, 
+ size: 6, + }, + mode: 'lines+markers', + }; + }); +} + +function getColorForTrace(traceName, colorsMapping, index) { + const reusedColor = colorsMapping.get(traceName); + if (reusedColor) { + return reusedColor; + } + + const color = getColor(index); + colorsMapping.set(traceName, color); + return color; +} + +function clearPlot(container) { + const plotContainer = container.querySelector('.plot-container'); + Plotly.purge(plotContainer); +} + +function truncateText(text, maxLength) { + if (text.length <= maxLength) return text; + return text.substr(0, maxLength - 2) + '..'; +} + diff --git a/app/src/content/assets/finetasks/data/ar/acva_ara:_average_data.csv b/app/src/content/assets/finetasks/data/ar/acva_ara:_average_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..6fb7365d0363ca6030368a7d4a92dfc022dc7c42 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/acva_ara:_average_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ee86019600943234de0d00cb7f2cfb5f08adea529e281c47fb11ab39e904fa14 +size 26104 diff --git a/app/src/content/assets/finetasks/data/ar/acva_ara:_average_stats.csv b/app/src/content/assets/finetasks/data/ar/acva_ara:_average_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..5cd809e1fb6e85cc7c84085f1708091351802fb2 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/acva_ara:_average_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:79551f7eeb2579538604681929741203205e6150f95187ea5319e3e9671f634e +size 1078 diff --git a/app/src/content/assets/finetasks/data/ar/alfgahafa_mlqa_ara_cf_data.csv b/app/src/content/assets/finetasks/data/ar/alfgahafa_mlqa_ara_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..d0da12bc63cbb23ab882b7121208ddd17e94e884 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/alfgahafa_mlqa_ara_cf_data.csv @@ -0,0 +1,3 @@ +version 
https://git-lfs.github.com/spec/v1 +oid sha256:0753a9fb838808ff6855bfcce87eb7d716d406dff82985e64bd72abf3e0eeed6 +size 20564 diff --git a/app/src/content/assets/finetasks/data/ar/alfgahafa_mlqa_ara_cf_stats.csv b/app/src/content/assets/finetasks/data/ar/alfgahafa_mlqa_ara_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..30e1f65d07051a69136d9f7ab8e4fe80cc389aac --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/alfgahafa_mlqa_ara_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b42cd429953188f1e3a2f61a3bbcd3aa669421bac407a5f2843b9ad3bc287b9b +size 903 diff --git a/app/src/content/assets/finetasks/data/ar/alghafa_arc_ara_cf:easy_data.csv b/app/src/content/assets/finetasks/data/ar/alghafa_arc_ara_cf:easy_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..c62a92b1d12efeec68c3cd8038985450c448cc93 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/alghafa_arc_ara_cf:easy_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:810a68eb754b4f0a3acae2a34c311676c78d926fd88e34e1c0bb9be949e3aa20 +size 18155 diff --git a/app/src/content/assets/finetasks/data/ar/alghafa_arc_ara_cf:easy_stats.csv b/app/src/content/assets/finetasks/data/ar/alghafa_arc_ara_cf:easy_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..c22fd3f6406ab3b07ace1a7e0f7fd4b664443add --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/alghafa_arc_ara_cf:easy_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:642e379750d340963d86ff023426787891d7cb494bf135c33be48c0c9897519f +size 908 diff --git a/app/src/content/assets/finetasks/data/ar/alghafa_exams_ara_cf:_average_data.csv b/app/src/content/assets/finetasks/data/ar/alghafa_exams_ara_cf:_average_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..9bde66d05dde1706503eb18fb0616654ff506718 --- /dev/null +++ 
b/app/src/content/assets/finetasks/data/ar/alghafa_exams_ara_cf:_average_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:cf7f6df15ba9e2c552c721bed4d292cf75a8bf6b3f3cbd5f65c9903b99e463d0 +size 24386 diff --git a/app/src/content/assets/finetasks/data/ar/alghafa_exams_ara_cf:_average_stats.csv b/app/src/content/assets/finetasks/data/ar/alghafa_exams_ara_cf:_average_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..d78ca7468d56a47d5f1768b302c3f954c6bd44b3 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/alghafa_exams_ara_cf:_average_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9b25491622030909b5b075cd7744fcad61fabe7103253c14355710762cbdc6d6 +size 928 diff --git a/app/src/content/assets/finetasks/data/ar/alghafa_facts_ara_cf_data.csv b/app/src/content/assets/finetasks/data/ar/alghafa_facts_ara_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..7e8b05fe8696b4c0b6f2adcbb327e37090e73615 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/alghafa_facts_ara_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b3b1f369ae9a64e27702437a049456d90fff09c62133a0232cd146a19bfb1bba +size 17318 diff --git a/app/src/content/assets/finetasks/data/ar/alghafa_facts_ara_cf_stats.csv b/app/src/content/assets/finetasks/data/ar/alghafa_facts_ara_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..09862d0351749673580bc02c424bfa67ed6cbe91 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/alghafa_facts_ara_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:19159aa9195f4a9eebbc6a91431d65dce630ec97edf39e32fe3c0f8dc302e546 +size 834 diff --git a/app/src/content/assets/finetasks/data/ar/alghafa_meta_dialects_ara_cf_data.csv b/app/src/content/assets/finetasks/data/ar/alghafa_meta_dialects_ara_cf_data.csv new file mode 100644 index 
0000000000000000000000000000000000000000..beb7ab4fc6ebb976bd3a62e2d25bf7e124756a1e --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/alghafa_meta_dialects_ara_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:714e3326ff11bfebb268366315fa4b5cf305b9cb8174c451db33773f5ac88d78 +size 18138 diff --git a/app/src/content/assets/finetasks/data/ar/alghafa_meta_dialects_ara_cf_stats.csv b/app/src/content/assets/finetasks/data/ar/alghafa_meta_dialects_ara_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..5713df0dc64c87b5cdfd279198f10d1cb50817b6 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/alghafa_meta_dialects_ara_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e7a37090de73b4fc41f1011e031ad56ff95e3883662275daf8c67656e166b5f9 +size 935 diff --git a/app/src/content/assets/finetasks/data/ar/alghafa_mmlu_ara_cf:_average_data.csv b/app/src/content/assets/finetasks/data/ar/alghafa_mmlu_ara_cf:_average_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..8643eda0b8c39deef193aeedc4c5aa3f5c410351 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/alghafa_mmlu_ara_cf:_average_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6959492567e052a2f9251d092f449dda7ed9118daca1441f5c146e6d2761e10c +size 23032 diff --git a/app/src/content/assets/finetasks/data/ar/alghafa_mmlu_ara_cf:_average_stats.csv b/app/src/content/assets/finetasks/data/ar/alghafa_mmlu_ara_cf:_average_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..5eb7b3df4dc3d62220c43de44bc838f51a41e18f --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/alghafa_mmlu_ara_cf:_average_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3f579eac42e69687634fa0d27ca738a80d5fd854b4dc33ef069210def32a7394 +size 937 diff --git 
a/app/src/content/assets/finetasks/data/ar/alghafa_openbookqa_ara_cf_data.csv b/app/src/content/assets/finetasks/data/ar/alghafa_openbookqa_ara_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..f5c5dc24136af997a10345f4c0eb638ae7db3c54 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/alghafa_openbookqa_ara_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9fe35704efef4b670a7f3ec6f64d64aa2e14f387caaa12b7c5da0eda18c4078a +size 22998 diff --git a/app/src/content/assets/finetasks/data/ar/alghafa_openbookqa_ara_cf_stats.csv b/app/src/content/assets/finetasks/data/ar/alghafa_openbookqa_ara_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..d824d2acd38f2ae28d768dbae78b5a123a9f0de1 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/alghafa_openbookqa_ara_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5e137221596d32ecd3f77c2f451c87e8ff8743a286816f974e7457290a2cfaec +size 925 diff --git a/app/src/content/assets/finetasks/data/ar/alghafa_piqa_ara_cf_data.csv b/app/src/content/assets/finetasks/data/ar/alghafa_piqa_ara_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..d8ceabc456bea15f4e9aa6e89c560123950d9a95 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/alghafa_piqa_ara_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a436d5d6e494aa37f2f9c4e4f14c2376d97d1c48ed116ca4f9c6f65caf0fbc3f +size 18478 diff --git a/app/src/content/assets/finetasks/data/ar/alghafa_piqa_ara_cf_stats.csv b/app/src/content/assets/finetasks/data/ar/alghafa_piqa_ara_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..ee89f440518ef33a052878cd20c7e1dff85410f7 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/alghafa_piqa_ara_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:5f01a72195ce586f94d8dad8210e5accd5459bf83712ad968b149701dfe4b9e8 +size 880 diff --git a/app/src/content/assets/finetasks/data/ar/alghafa_race_ara_cf_data.csv b/app/src/content/assets/finetasks/data/ar/alghafa_race_ara_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..09c60c700d033ce8a020be7ec66b8489a7d4b33f --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/alghafa_race_ara_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:80b15aea8264a8bdac278cf2ac0d07cbcc0e8c7ccac150ace4e26dd65471e6fc +size 18432 diff --git a/app/src/content/assets/finetasks/data/ar/alghafa_race_ara_cf_stats.csv b/app/src/content/assets/finetasks/data/ar/alghafa_race_ara_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..182fa98b229bff13917e64799829263dab02f6c0 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/alghafa_race_ara_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:80ae78c9142cf89d1be55977578c2ca041838a5dbaa736b40958a13730e46ae9 +size 893 diff --git a/app/src/content/assets/finetasks/data/ar/alghafa_rating_sentiment_ara_cf_data.csv b/app/src/content/assets/finetasks/data/ar/alghafa_rating_sentiment_ara_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..98f0eaf2d01218821b7bd29110404748f347bbfa --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/alghafa_rating_sentiment_ara_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bdc28b21863e88fcc8fac6245d25da05db90d7da234708b341636400b2584769 +size 18023 diff --git a/app/src/content/assets/finetasks/data/ar/alghafa_rating_sentiment_ara_cf_stats.csv b/app/src/content/assets/finetasks/data/ar/alghafa_rating_sentiment_ara_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..40fbd4ae6895d819e06eca6edcbbe7c3c721ff6a --- /dev/null +++ 
b/app/src/content/assets/finetasks/data/ar/alghafa_rating_sentiment_ara_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3291cd3da2406bdcc358764cf42ae1a21cf9d1c07b4daf5998e7695a09317c37 +size 936 diff --git a/app/src/content/assets/finetasks/data/ar/alghafa_rating_sentiment_no_neutral_ara_cf_data.csv b/app/src/content/assets/finetasks/data/ar/alghafa_rating_sentiment_no_neutral_ara_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..0a773ae59b909d4e59cfab199d4554e98aa027e0 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/alghafa_rating_sentiment_no_neutral_ara_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ab37a3551307dadb3260bcd768b554954b7451493ce1e07bbaf8465d25f3e09b +size 16661 diff --git a/app/src/content/assets/finetasks/data/ar/alghafa_rating_sentiment_no_neutral_ara_cf_stats.csv b/app/src/content/assets/finetasks/data/ar/alghafa_rating_sentiment_no_neutral_ara_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..d8bcd5efaffd3f44d40f0a69c9230ff88178a31d --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/alghafa_rating_sentiment_no_neutral_ara_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:20a08683aa5cdd6a8a64efcffcf942a4a26dcf2550a2d770b0ff46b8c40ecffe +size 970 diff --git a/app/src/content/assets/finetasks/data/ar/alghafa_sciqa_ara_cf_data.csv b/app/src/content/assets/finetasks/data/ar/alghafa_sciqa_ara_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..2e6ac8ee4c0e91c9942ee456e3c1f1eaecf0efb9 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/alghafa_sciqa_ara_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a29f2dcacf4f8d53096dddfbeb1f5cb8eb8e8a5354dc2f979a1908c5541ebcd6 +size 23819 diff --git a/app/src/content/assets/finetasks/data/ar/alghafa_sciqa_ara_cf_stats.csv 
b/app/src/content/assets/finetasks/data/ar/alghafa_sciqa_ara_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..b9eac59cb738836cfbb2c3806b8b5abee7dca070 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/alghafa_sciqa_ara_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e6cac24425904dec66f4bb38aa34d65d0c1a6bc539baf5ea5300c5f7bc362626 +size 894 diff --git a/app/src/content/assets/finetasks/data/ar/alghafa_sentiment_ara_cf_data.csv b/app/src/content/assets/finetasks/data/ar/alghafa_sentiment_ara_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..a56bc0b819a002ac534acf074c61ba912c3db393 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/alghafa_sentiment_ara_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c69085a0ca2df0adc4f9ec3c3b9857adad82d6749f41d88e3d43ba16e6d936d3 +size 17942 diff --git a/app/src/content/assets/finetasks/data/ar/alghafa_sentiment_ara_cf_stats.csv b/app/src/content/assets/finetasks/data/ar/alghafa_sentiment_ara_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..f70c676e779f589fcf31b0ab9f3172ff53534988 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/alghafa_sentiment_ara_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b1db01058cac603bcb5b0a991b39d9499a3537ab0da1ad36eb1c3b317c8d5ff4 +size 903 diff --git a/app/src/content/assets/finetasks/data/ar/arcd_ara_data.csv b/app/src/content/assets/finetasks/data/ar/arcd_ara_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..1064eef317e47b765cc3db58bc931c5c9e15b896 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/arcd_ara_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7f5b91b32d5c9a58a34ac035fd9e880de1256f5d0c47edfed7fe591abed789fa +size 15849 diff --git 
a/app/src/content/assets/finetasks/data/ar/arcd_ara_stats.csv b/app/src/content/assets/finetasks/data/ar/arcd_ara_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..47b970eed050312dab739d82cbdcdf47c58df321 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/arcd_ara_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3d15a351c2e4cf3dcc3372637baafd4821397bb7ab00c81704d91ec8b55e6a31 +size 478 diff --git a/app/src/content/assets/finetasks/data/ar/belebele_arb_Arab_cf_data.csv b/app/src/content/assets/finetasks/data/ar/belebele_arb_Arab_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..3cbfbd9db316328aaef15ad2f0ec30556dfc0603 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/belebele_arb_Arab_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6464f54a9a933b4b47c9c513c907ae358909518998ad5db01d8580578b77a1c6 +size 23912 diff --git a/app/src/content/assets/finetasks/data/ar/belebele_arb_Arab_cf_stats.csv b/app/src/content/assets/finetasks/data/ar/belebele_arb_Arab_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..4d88c3054f5aa4c549c66093a04ef495cfaa1687 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/belebele_arb_Arab_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fda4577c55a3298b9dc6f6a2e148aaf6a4038f8604811ca339f88d3c3f6e7573 +size 903 diff --git a/app/src/content/assets/finetasks/data/ar/boolq_ara_data.csv b/app/src/content/assets/finetasks/data/ar/boolq_ara_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..299a534a38ed3f948a31997f5d0a93d87a27ec75 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/boolq_ara_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:54675b0af158ca756a5c08ea1b6315f757df93827a3a107b4208b135bdf6d8db +size 18834 diff --git 
a/app/src/content/assets/finetasks/data/ar/boolq_ara_stats.csv b/app/src/content/assets/finetasks/data/ar/boolq_ara_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..ca9156794d864cde62191083d08714d94dadb0d4 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/boolq_ara_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:32457679a2dc38045522dab504c13590c7199bcb452f65acfcf337fbbb3bdc2c +size 1042 diff --git a/app/src/content/assets/finetasks/data/ar/community_arc_hin_cf:challenge_data.csv b/app/src/content/assets/finetasks/data/ar/community_arc_hin_cf:challenge_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..7529a61c7e2a97605acd93e055286eadc50efb91 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/community_arc_hin_cf:challenge_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2649f9802e39e19da555d2e42851281cca18826534d23246bffd8b15a43e326a +size 14390 diff --git a/app/src/content/assets/finetasks/data/ar/community_arc_hin_cf:challenge_stats.csv b/app/src/content/assets/finetasks/data/ar/community_arc_hin_cf:challenge_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..866745da55b886c8091d4f124341d4ee1c9a6698 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/community_arc_hin_cf:challenge_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:aabcf0d879390556fa664c0fb532afa47580407e37f2552026dbfffab89ebf57 +size 469 diff --git a/app/src/content/assets/finetasks/data/ar/community_arc_hin_cf:easy_data.csv b/app/src/content/assets/finetasks/data/ar/community_arc_hin_cf:easy_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..7529a61c7e2a97605acd93e055286eadc50efb91 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/community_arc_hin_cf:easy_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:2649f9802e39e19da555d2e42851281cca18826534d23246bffd8b15a43e326a +size 14390 diff --git a/app/src/content/assets/finetasks/data/ar/community_arc_hin_cf:easy_stats.csv b/app/src/content/assets/finetasks/data/ar/community_arc_hin_cf:easy_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..4672e3e514457a3d2b5d784e72281c56d2109c49 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/community_arc_hin_cf:easy_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d213b818d04764e3bd9a0d0ad57ab9e0e38fe26d8db51942c12c1b8eb92f3636 +size 449 diff --git a/app/src/content/assets/finetasks/data/ar/community_arc_swa_cf:challenge_data.csv b/app/src/content/assets/finetasks/data/ar/community_arc_swa_cf:challenge_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..7529a61c7e2a97605acd93e055286eadc50efb91 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/community_arc_swa_cf:challenge_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2649f9802e39e19da555d2e42851281cca18826534d23246bffd8b15a43e326a +size 14390 diff --git a/app/src/content/assets/finetasks/data/ar/community_arc_swa_cf:challenge_stats.csv b/app/src/content/assets/finetasks/data/ar/community_arc_swa_cf:challenge_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..b48b834b8f443ec7ca7af2e3f08089e23dbb29a5 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/community_arc_swa_cf:challenge_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a7f3601c8b4750a98708960e1df4c574b2038821e4a5740837d52ad770bbbd3c +size 469 diff --git a/app/src/content/assets/finetasks/data/ar/community_arc_swa_cf:easy_data.csv b/app/src/content/assets/finetasks/data/ar/community_arc_swa_cf:easy_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..7529a61c7e2a97605acd93e055286eadc50efb91 --- /dev/null +++ 
b/app/src/content/assets/finetasks/data/ar/community_arc_swa_cf:easy_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2649f9802e39e19da555d2e42851281cca18826534d23246bffd8b15a43e326a +size 14390 diff --git a/app/src/content/assets/finetasks/data/ar/community_arc_swa_cf:easy_stats.csv b/app/src/content/assets/finetasks/data/ar/community_arc_swa_cf:easy_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..48e32f93e9efb5e359a28e1ff426791e72225217 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/community_arc_swa_cf:easy_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f25f52a5fe47096977b9dd294e354c65f54225ec87a7cde264933b5229ca0a67 +size 449 diff --git a/app/src/content/assets/finetasks/data/ar/community_arc_tur_cf:easy_data.csv b/app/src/content/assets/finetasks/data/ar/community_arc_tur_cf:easy_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..7529a61c7e2a97605acd93e055286eadc50efb91 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/community_arc_tur_cf:easy_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2649f9802e39e19da555d2e42851281cca18826534d23246bffd8b15a43e326a +size 14390 diff --git a/app/src/content/assets/finetasks/data/ar/community_arc_tur_cf:easy_stats.csv b/app/src/content/assets/finetasks/data/ar/community_arc_tur_cf:easy_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..81956a1d9a1a479f4a08e416d19f9a1978a1a2cc --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/community_arc_tur_cf:easy_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ad546f00bd725bea998a5fc4c6a870f43a1a4e7457bda42b110096b13a029fd8 +size 449 diff --git a/app/src/content/assets/finetasks/data/ar/exams_ara_cf:_average_data.csv b/app/src/content/assets/finetasks/data/ar/exams_ara_cf:_average_data.csv new file mode 100644 index 
0000000000000000000000000000000000000000..c73c7e170711a614eb924ad05c1ddb39bc26bf64 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/exams_ara_cf:_average_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:af146e0409fc3332f8f250a36caabb270e01ad48ad5d04dd539de86bdc8529ff +size 36571 diff --git a/app/src/content/assets/finetasks/data/ar/exams_ara_cf:_average_stats.csv b/app/src/content/assets/finetasks/data/ar/exams_ara_cf:_average_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..5eba076d4c93f90d67657048f1c3651e3aaa08cd --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/exams_ara_cf:_average_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9dbd86d8c6c5922af78b295bbc8c89f483049a17500be20cb625565a2e599242 +size 1717 diff --git a/app/src/content/assets/finetasks/data/ar/frenchbench_arc_fra_cf:challenge_data.csv b/app/src/content/assets/finetasks/data/ar/frenchbench_arc_fra_cf:challenge_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..7529a61c7e2a97605acd93e055286eadc50efb91 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/frenchbench_arc_fra_cf:challenge_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2649f9802e39e19da555d2e42851281cca18826534d23246bffd8b15a43e326a +size 14390 diff --git a/app/src/content/assets/finetasks/data/ar/frenchbench_arc_fra_cf:challenge_stats.csv b/app/src/content/assets/finetasks/data/ar/frenchbench_arc_fra_cf:challenge_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..a9466be97f6b56964aa71026b43cd8c95b64f6ed --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/frenchbench_arc_fra_cf:challenge_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2da7b002e63a530df289c706c5c56113d16427105d18cff9dc556f25feb7e5e5 +size 477 diff --git 
a/app/src/content/assets/finetasks/data/ar/mkqa_ara:_average_data.csv b/app/src/content/assets/finetasks/data/ar/mkqa_ara:_average_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..da9496f1ae3132151720db4e311718a5de8c0c9e --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/mkqa_ara:_average_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9388c7fefc1e771f16b22c4dd7412112a7722428b92de4bce8518afe83103690 +size 16844 diff --git a/app/src/content/assets/finetasks/data/ar/mkqa_ara:_average_stats.csv b/app/src/content/assets/finetasks/data/ar/mkqa_ara:_average_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..6630061d596db75783169a00b4f655e715e09b59 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/mkqa_ara:_average_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a0168960dbabff461f0ececcc1da6e71cefda60ce6d26c2aff5ed641fd293cf6 +size 495 diff --git a/app/src/content/assets/finetasks/data/ar/mlmm_arc_ara_cf:challenge_data.csv b/app/src/content/assets/finetasks/data/ar/mlmm_arc_ara_cf:challenge_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..d7b16e16c2257fec81755178a8fe5dcecce67067 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/mlmm_arc_ara_cf:challenge_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c76ad0e6107995e063aafa4d3873c92a778982fe701690fb5ad70de4bd64ef8a +size 17746 diff --git a/app/src/content/assets/finetasks/data/ar/mlmm_arc_ara_cf:challenge_stats.csv b/app/src/content/assets/finetasks/data/ar/mlmm_arc_ara_cf:challenge_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..9670a652167bb538cee8769e52289283beb98552 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/mlmm_arc_ara_cf:challenge_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:16cab1f8bd13af93e4344179376e299875b537dd357810c85498f379d12bb731 +size 913 diff --git a/app/src/content/assets/finetasks/data/ar/mlmm_arc_fra_cf:challenge_data.csv b/app/src/content/assets/finetasks/data/ar/mlmm_arc_fra_cf:challenge_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..7529a61c7e2a97605acd93e055286eadc50efb91 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/mlmm_arc_fra_cf:challenge_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2649f9802e39e19da555d2e42851281cca18826534d23246bffd8b15a43e326a +size 14390 diff --git a/app/src/content/assets/finetasks/data/ar/mlmm_arc_fra_cf:challenge_stats.csv b/app/src/content/assets/finetasks/data/ar/mlmm_arc_fra_cf:challenge_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..78106f43cb9f3e1b482f960de71fc5a562e6b051 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/mlmm_arc_fra_cf:challenge_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:cfe502ab75d45a90a278416e7e8509615f985099462ce0befa67b63abe6c5c54 +size 449 diff --git a/app/src/content/assets/finetasks/data/ar/mlmm_arc_hin_cf:challenge_data.csv b/app/src/content/assets/finetasks/data/ar/mlmm_arc_hin_cf:challenge_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..7529a61c7e2a97605acd93e055286eadc50efb91 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/mlmm_arc_hin_cf:challenge_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2649f9802e39e19da555d2e42851281cca18826534d23246bffd8b15a43e326a +size 14390 diff --git a/app/src/content/assets/finetasks/data/ar/mlmm_arc_hin_cf:challenge_stats.csv b/app/src/content/assets/finetasks/data/ar/mlmm_arc_hin_cf:challenge_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..52183321ada540ac07888d5c108576197859ddea --- /dev/null +++ 
b/app/src/content/assets/finetasks/data/ar/mlmm_arc_hin_cf:challenge_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:60c4097020b7c67df542eae690373fd1a9d8a4dd15ea1e194c58df03bd1b8e44 +size 449 diff --git a/app/src/content/assets/finetasks/data/ar/mlmm_arc_rus_cf:challenge_data.csv b/app/src/content/assets/finetasks/data/ar/mlmm_arc_rus_cf:challenge_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..7529a61c7e2a97605acd93e055286eadc50efb91 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/mlmm_arc_rus_cf:challenge_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2649f9802e39e19da555d2e42851281cca18826534d23246bffd8b15a43e326a +size 14390 diff --git a/app/src/content/assets/finetasks/data/ar/mlmm_arc_rus_cf:challenge_stats.csv b/app/src/content/assets/finetasks/data/ar/mlmm_arc_rus_cf:challenge_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..4b84e3c93b204ce44b2f929a45951487113c19f1 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/mlmm_arc_rus_cf:challenge_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2f80cb3558744a053924de7e22cae99390b8e306f5ea637093a32e12ac84e868 +size 449 diff --git a/app/src/content/assets/finetasks/data/ar/mlmm_arc_tel_cf:challenge_data.csv b/app/src/content/assets/finetasks/data/ar/mlmm_arc_tel_cf:challenge_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..7529a61c7e2a97605acd93e055286eadc50efb91 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/mlmm_arc_tel_cf:challenge_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2649f9802e39e19da555d2e42851281cca18826534d23246bffd8b15a43e326a +size 14390 diff --git a/app/src/content/assets/finetasks/data/ar/mlmm_arc_tel_cf:challenge_stats.csv b/app/src/content/assets/finetasks/data/ar/mlmm_arc_tel_cf:challenge_stats.csv new file mode 100644 index 
0000000000000000000000000000000000000000..7565935f46cbe0049869bb88b3d99357b4a13c29 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/mlmm_arc_tel_cf:challenge_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ab50b6bcca97ef16c74814dc339f6b105f743a154febd41037174db29f057c51 +size 449 diff --git a/app/src/content/assets/finetasks/data/ar/mlmm_arc_zho_cf:challenge_data.csv b/app/src/content/assets/finetasks/data/ar/mlmm_arc_zho_cf:challenge_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..7529a61c7e2a97605acd93e055286eadc50efb91 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/mlmm_arc_zho_cf:challenge_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2649f9802e39e19da555d2e42851281cca18826534d23246bffd8b15a43e326a +size 14390 diff --git a/app/src/content/assets/finetasks/data/ar/mlmm_arc_zho_cf:challenge_stats.csv b/app/src/content/assets/finetasks/data/ar/mlmm_arc_zho_cf:challenge_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..ab42af45ec49335e7d2551fc09dbcce59a44e921 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/mlmm_arc_zho_cf:challenge_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6d049f48f1e42ccb713b36bb55040348b3fb21a373aaf939b57ad018eec87ec1 +size 449 diff --git a/app/src/content/assets/finetasks/data/ar/mlmm_hellaswag_ara_cf_data.csv b/app/src/content/assets/finetasks/data/ar/mlmm_hellaswag_ara_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..c477b8506e45d28a2078b332a638e0551529b5e8 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/mlmm_hellaswag_ara_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:355137ede77c972f37f8270a642dc943bc5a3bc7ebbc4e7559bddf8c941b6238 +size 17845 diff --git a/app/src/content/assets/finetasks/data/ar/mlmm_hellaswag_ara_cf_stats.csv 
b/app/src/content/assets/finetasks/data/ar/mlmm_hellaswag_ara_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..df93ea17231869640d3bb6e3e9b077c4a6a77223 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/mlmm_hellaswag_ara_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:31d5cf55ed57addb77dbad1b591b616e05a7497c4a8327d88e50c1712872da0d +size 901 diff --git a/app/src/content/assets/finetasks/data/ar/mlmm_mmlu_ara_cf:_average_data.csv b/app/src/content/assets/finetasks/data/ar/mlmm_mmlu_ara_cf:_average_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..d0e2430c9c3b133df747c01290d26f152973325a --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/mlmm_mmlu_ara_cf:_average_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:76899f7deceade79e6c90dd7a3fc44ed1da1517938fa09ffe6e6e9d980f03bf4 +size 23216 diff --git a/app/src/content/assets/finetasks/data/ar/mlmm_mmlu_ara_cf:_average_stats.csv b/app/src/content/assets/finetasks/data/ar/mlmm_mmlu_ara_cf:_average_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..8ea81a548291c6a7cbb6a92dd961d637222e1211 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/mlmm_mmlu_ara_cf:_average_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:10a88157f7c7dea32db2d984bc9e0145d3a230c274ba67fc4d5d07712e981749 +size 932 diff --git a/app/src/content/assets/finetasks/data/ar/mlmm_truthfulqa_ara_cf:mc1_data.csv b/app/src/content/assets/finetasks/data/ar/mlmm_truthfulqa_ara_cf:mc1_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..70e9142ca23f3bf66cfb081ff05a82b6bd62d32e --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/mlmm_truthfulqa_ara_cf:mc1_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:67e450b247f2bf9484d31ce547d90595bcf446a99c119b702e8c985ab6d140c3 +size 23840 diff 
--git a/app/src/content/assets/finetasks/data/ar/mlmm_truthfulqa_ara_cf:mc1_stats.csv b/app/src/content/assets/finetasks/data/ar/mlmm_truthfulqa_ara_cf:mc1_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..ec500f9861e666386af1ac59634a87b7a3ad6f16 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/mlmm_truthfulqa_ara_cf:mc1_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4aa51d97695e792a62dae8b37786fc4bf731390faf724186181c7bb2e3010b30 +size 916 diff --git a/app/src/content/assets/finetasks/data/ar/mlmm_truthfulqa_ara_cf:mc2_data.csv b/app/src/content/assets/finetasks/data/ar/mlmm_truthfulqa_ara_cf:mc2_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..f7182d22cf5b9cae0460789cd26008333c9a82aa --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/mlmm_truthfulqa_ara_cf:mc2_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:33878fcc6a2e346525c4da2a33b80c095198d6afceee732ba67690970f47deca +size 24686 diff --git a/app/src/content/assets/finetasks/data/ar/mlmm_truthfulqa_ara_cf:mc2_stats.csv b/app/src/content/assets/finetasks/data/ar/mlmm_truthfulqa_ara_cf:mc2_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..6844804a1edea2dfb56dcff586bcea788237dee8 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/mlmm_truthfulqa_ara_cf:mc2_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4553563443b892e7f6aae73be0126419b2741e3798e782bf666d61c1ea38e4e0 +size 935 diff --git a/app/src/content/assets/finetasks/data/ar/mlqa_ara_data.csv b/app/src/content/assets/finetasks/data/ar/mlqa_ara_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..6bc2ffbd37d1a4237a44bb5a33664f0964876754 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/mlqa_ara_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:924e3dcf6c2ffc5d1f23fe52346eec588b98b06044cad0e52dfc98ae1f4141d4 +size 25721 diff --git a/app/src/content/assets/finetasks/data/ar/mlqa_ara_stats.csv b/app/src/content/assets/finetasks/data/ar/mlqa_ara_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..553a75947fcc156ed5f57f6e4f8959db916e1a49 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/mlqa_ara_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6eb4f7031db6d8aa8d3e954acd1ba2237b67737c358f5990b4880e63ccf7de56 +size 1267 diff --git a/app/src/content/assets/finetasks/data/ar/mmlu_ara_cf:_average_data.csv b/app/src/content/assets/finetasks/data/ar/mmlu_ara_cf:_average_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..a9b71dea55cf2b504f5617695d41025bc13dcfbb --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/mmlu_ara_cf:_average_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:66da43aa5e53ba73bf69d09fd3ea9d6b5b5ae7900100b4be4d58afb5010ef17e +size 49243 diff --git a/app/src/content/assets/finetasks/data/ar/mmlu_ara_cf:_average_stats.csv b/app/src/content/assets/finetasks/data/ar/mmlu_ara_cf:_average_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..df5851f14c078e9da1a0ca76f9d2f6a19fbe75c6 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/mmlu_ara_cf:_average_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:257196b39158bb53528ffe4f51bb53df96c80ee647cc60e3bbebca98dbbe4e75 +size 2549 diff --git a/app/src/content/assets/finetasks/data/ar/soqal_ara_cf_data.csv b/app/src/content/assets/finetasks/data/ar/soqal_ara_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..df44b118f1c4bbf1ba1717c7df053e8c757d85cf --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/soqal_ara_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:8cd119fff381c5f57733187123c8e69584513ed4350dbb800a39eb54b8702f51 +size 21365 diff --git a/app/src/content/assets/finetasks/data/ar/soqal_ara_cf_stats.csv b/app/src/content/assets/finetasks/data/ar/soqal_ara_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..bbc47ee18290e42f92a8f6e58215156fbfe0bd96 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/soqal_ara_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:744e8dd10c041b23a117366edcaf9220d5ac39ae91831285ad2ceb2524a0ff42 +size 864 diff --git a/app/src/content/assets/finetasks/data/ar/toxigen_ara_cf_data.csv b/app/src/content/assets/finetasks/data/ar/toxigen_ara_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..5addc61ff35729f0918e4c7d437ddf081b4c8ca5 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/toxigen_ara_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f34797f85e13cae143cc51bb3c3c907d00a15c2fe137ddbd554041d05fc3b9d6 +size 23014 diff --git a/app/src/content/assets/finetasks/data/ar/toxigen_ara_cf_stats.csv b/app/src/content/assets/finetasks/data/ar/toxigen_ara_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..b5bd45e775eb35505dbcf12a75d52924d9f5ce8f --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/toxigen_ara_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ad29ef07e8b1e04dad7b1ce1d59810ad0d7c3b126153183153144c0ad2fe6c15 +size 789 diff --git a/app/src/content/assets/finetasks/data/ar/tydiqa_ara_data.csv b/app/src/content/assets/finetasks/data/ar/tydiqa_ara_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..24ed3024854fea76f896a9bd0c1d25e896249274 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/tydiqa_ara_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:71197a077abe7df2b5b8b6ca9b954b613a2455702ce25b62634f92bbe4ca39fb +size 16876 diff --git a/app/src/content/assets/finetasks/data/ar/tydiqa_ara_stats.csv b/app/src/content/assets/finetasks/data/ar/tydiqa_ara_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..9dd8305507dab577e96b94ec02fff6d1714569d8 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/tydiqa_ara_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2a8236b30a8ef64428addb99b8d012823751705e98872dbd303ea08fc823dd9c +size 477 diff --git a/app/src/content/assets/finetasks/data/ar/xcodah_ara_cf_data.csv b/app/src/content/assets/finetasks/data/ar/xcodah_ara_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..2b89ee637447e5c04f7aa0cab9241df8fcdbf7c1 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/xcodah_ara_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:282438f07dd6efb8a326f956d3fbd0e75ed715e1212c0a49dca1e19e27077651 +size 21803 diff --git a/app/src/content/assets/finetasks/data/ar/xcodah_ara_cf_stats.csv b/app/src/content/assets/finetasks/data/ar/xcodah_ara_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..55163dd96f332a5db0d1dd031e19d47e39d43e3d --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/xcodah_ara_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:76c80d7b9ab99dcd4783a908bbaf3fdf518f5ea86635e52df17d502f77545d7a +size 870 diff --git a/app/src/content/assets/finetasks/data/ar/xcopa_ara_cf_data.csv b/app/src/content/assets/finetasks/data/ar/xcopa_ara_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..c6b134f63fb6bf53989c26472e98119c314aad01 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/xcopa_ara_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:d5751498ba2dd1e00ba278113a7a006080039ec6fbd9ac410b232b18c5d258f9 +size 23030 diff --git a/app/src/content/assets/finetasks/data/ar/xcopa_ara_cf_stats.csv b/app/src/content/assets/finetasks/data/ar/xcopa_ara_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..6ab17e11e48fcdeaeec86e424bc15beb4ade2d25 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/xcopa_ara_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f5b7820d0b7506baeb46a22355cd71711ae75923f4c38f03313edb3530498bf1 +size 871 diff --git a/app/src/content/assets/finetasks/data/ar/xcsqa_ara_cf_data.csv b/app/src/content/assets/finetasks/data/ar/xcsqa_ara_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..0bc9d8beb0d74c3e11d9d5f7720360bd07ba8bc1 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/xcsqa_ara_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:70bf3e3881cac606f7e7aa07beadb2ed971d6f1c4d01787a715edef4a899553f +size 18041 diff --git a/app/src/content/assets/finetasks/data/ar/xcsqa_ara_cf_stats.csv b/app/src/content/assets/finetasks/data/ar/xcsqa_ara_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..af75523172f66638982ef8abbf63019b10e4cae5 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/xcsqa_ara_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:68d907b72bda3faba663edfb8ca5cd7b3256a34828eaa7bb893d3fa22239b5f0 +size 868 diff --git a/app/src/content/assets/finetasks/data/ar/xnli2.0_ara_cf_data.csv b/app/src/content/assets/finetasks/data/ar/xnli2.0_ara_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..37b420206b4444adfe37364b77f0f8ad45fe2673 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/xnli2.0_ara_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:b240be4551225f27b28262974392ad3327de42548a9ad2f85b3bdc910d018aeb +size 17454 diff --git a/app/src/content/assets/finetasks/data/ar/xnli2.0_ara_cf_stats.csv b/app/src/content/assets/finetasks/data/ar/xnli2.0_ara_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..b95dca485374ac617562f4f9f5a4a2ebf0a1b779 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/xnli2.0_ara_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1edf99839eb4a763d4c7b1dad0544fb88ff1e5bede190dcdc33a3b019cb1e9e9 +size 877 diff --git a/app/src/content/assets/finetasks/data/ar/xnli_ara_cf_data.csv b/app/src/content/assets/finetasks/data/ar/xnli_ara_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..8429dbc9791784f1fbf1e98b658030447a75f1e5 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/xnli_ara_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ee281131ccb32a25ac6dfda6e601d56dab4512a6e779cca5c8c2c37565cbb566 +size 17155 diff --git a/app/src/content/assets/finetasks/data/ar/xnli_ara_cf_stats.csv b/app/src/content/assets/finetasks/data/ar/xnli_ara_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..4c0b433ab1596fc917c3c5fdcf582faf7d628fe2 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/xnli_ara_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9f4899dfa69aacbcb7c06935f1d227ed46161025832d9fd134fe5f2552cdc1e0 +size 867 diff --git a/app/src/content/assets/finetasks/data/ar/xquad_ara_data.csv b/app/src/content/assets/finetasks/data/ar/xquad_ara_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..70e5e76f542012e538648f0f85ffb4216fc9ff14 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/xquad_ara_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:643414c6856396e285ad1e8f558c9bcadc4389cbd37c842c2915b5f29a3cb964 
+size 15388 diff --git a/app/src/content/assets/finetasks/data/ar/xquad_ara_stats.csv b/app/src/content/assets/finetasks/data/ar/xquad_ara_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..385bc75703d961d6dd1e213128565b22c48a079d --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/xquad_ara_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ca9e2c88a0240ef8289112e61fa8f900ade501701fd98c8a3a5d1048bcfba2c3 +size 466 diff --git a/app/src/content/assets/finetasks/data/ar/xstory_cloze_ara_cf_data.csv b/app/src/content/assets/finetasks/data/ar/xstory_cloze_ara_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..9059421ce46605e33939aa90ab7253fa0dc57373 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/xstory_cloze_ara_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2dd0bd6f63c23700e35ee10c80d3f27208d874f61cea28b5c2633bd2d38229b5 +size 17470 diff --git a/app/src/content/assets/finetasks/data/ar/xstory_cloze_ara_cf_stats.csv b/app/src/content/assets/finetasks/data/ar/xstory_cloze_ara_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..7d704ee37f9f3c01bf183f75a01f53f5b65e3c76 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ar/xstory_cloze_ara_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f14c2b687dd834ec027757630507bd6c8641cec5a1d1632a4849f3dc087b02f5 +size 882 diff --git a/app/src/content/assets/finetasks/data/fr/belebele_fra_Latn_cf_data.csv b/app/src/content/assets/finetasks/data/fr/belebele_fra_Latn_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..26aff24a99045af241a611ae86bde72a6611a74a --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/belebele_fra_Latn_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f8bf74303625348050907ec1257eba1cc8bc0bba774529469222193a818ee363 +size 
21155 diff --git a/app/src/content/assets/finetasks/data/fr/belebele_fra_Latn_cf_stats.csv b/app/src/content/assets/finetasks/data/fr/belebele_fra_Latn_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..0dd1621d533d36ee6dc7ec26fbcf10aa7568b1c5 --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/belebele_fra_Latn_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6ea6befd543af22f25a0c49573194d93b3e7c37066137d2f4b2640f6a6688e4c +size 778 diff --git a/app/src/content/assets/finetasks/data/fr/community_boolq_fra_cf_data.csv b/app/src/content/assets/finetasks/data/fr/community_boolq_fra_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..df8f421d9b54a04276d4aa3d2ab3a5e0efca23e2 --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/community_boolq_fra_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ea0ad0a0d568c73273518c1a9e6b900085be2d5479d4a9c791619edd2c8137ef +size 25646 diff --git a/app/src/content/assets/finetasks/data/fr/community_boolq_fra_cf_stats.csv b/app/src/content/assets/finetasks/data/fr/community_boolq_fra_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..2f78cb50df570bd2a0a65d25653745b545116ed0 --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/community_boolq_fra_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7e013ec50b70086aa8b917c153775824083f11fcca1df0296d5780def4305381 +size 1123 diff --git a/app/src/content/assets/finetasks/data/fr/exams_fra_cf:_average_data.csv b/app/src/content/assets/finetasks/data/fr/exams_fra_cf:_average_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..75e16510c246fcf969c6edb6f3a132c85dc18aeb --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/exams_fra_cf:_average_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:ba8065f8456e8667fc1c00ffe79aa9714c92e1f470a3920463bb5597138ec4dd +size 24260 diff --git a/app/src/content/assets/finetasks/data/fr/exams_fra_cf:_average_stats.csv b/app/src/content/assets/finetasks/data/fr/exams_fra_cf:_average_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..8d5960a531a1c2ca3fae32b5e19d81c2a9108154 --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/exams_fra_cf:_average_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:af24c7f12821a26d125859d1dd4046630836c4e9b687ca53ed824015286fbf7c +size 900 diff --git a/app/src/content/assets/finetasks/data/fr/fquadv2_fra_data.csv b/app/src/content/assets/finetasks/data/fr/fquadv2_fra_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..c0a95c7df9a4a45b4a3afa77709fbcccbcd8a0c6 --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/fquadv2_fra_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:524490dcca0a6870937794b46b9c3808a951e1e13088a2667ae6fcc320e90c7d +size 16139 diff --git a/app/src/content/assets/finetasks/data/fr/fquadv2_fra_stats.csv b/app/src/content/assets/finetasks/data/fr/fquadv2_fra_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..503a77c132d629aa04dadc715fee6f68f53ccca0 --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/fquadv2_fra_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:01b20bc5fb7a0e069f92cb67d44153053b4c734b7fa2205418b4638d6a26d6da +size 487 diff --git a/app/src/content/assets/finetasks/data/fr/frenchbench_arc_fra_cf:challenge_data.csv b/app/src/content/assets/finetasks/data/fr/frenchbench_arc_fra_cf:challenge_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..9c39071bae9dbb47fe553155823d628da8c58c4a --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/frenchbench_arc_fra_cf:challenge_data.csv @@ -0,0 +1,3 @@ +version 
https://git-lfs.github.com/spec/v1 +oid sha256:9bcecdc5f66facef542aad5a0979821268341d4d14818bf11f29ab35e28a3af8 +size 17725 diff --git a/app/src/content/assets/finetasks/data/fr/frenchbench_arc_fra_cf:challenge_stats.csv b/app/src/content/assets/finetasks/data/fr/frenchbench_arc_fra_cf:challenge_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..b3e7e011145baad6b9b0b091371f899ea7feea15 --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/frenchbench_arc_fra_cf:challenge_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4128610f777512f5aab2ba8f38dd64b0148a1fc6c4b21eb652e273fe803884e5 +size 954 diff --git a/app/src/content/assets/finetasks/data/fr/frenchbench_hellaswag_fra_cf_data.csv b/app/src/content/assets/finetasks/data/fr/frenchbench_hellaswag_fra_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..cf78f03cb4999044258e3c76cdc9a5eb23b5b3a3 --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/frenchbench_hellaswag_fra_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:987edfd6ed20f7efad5172506cdc3041fffc93dd9b2f6808cc03b38b755e2bc4 +size 18164 diff --git a/app/src/content/assets/finetasks/data/fr/frenchbench_hellaswag_fra_cf_stats.csv b/app/src/content/assets/finetasks/data/fr/frenchbench_hellaswag_fra_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..7c843691931b3044c6cbaf2f0533c107a9f1902e --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/frenchbench_hellaswag_fra_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6631d35920ce2a9ed20a08f92ac44ade15d53d014dd24db6c01719bd74decaeb +size 932 diff --git a/app/src/content/assets/finetasks/data/fr/frenchbench_triviaqa_fra_cf_data.csv b/app/src/content/assets/finetasks/data/fr/frenchbench_triviaqa_fra_cf_data.csv new file mode 100644 index 
0000000000000000000000000000000000000000..dd7fad2cc79a836831c67299015abff95b7c9aa0 --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/frenchbench_triviaqa_fra_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:444ea95c60415862e4dc4ce8c5adf0167c2d17b8a2ec8386f1768be20531e8ea +size 11569 diff --git a/app/src/content/assets/finetasks/data/fr/frenchbench_triviaqa_fra_cf_stats.csv b/app/src/content/assets/finetasks/data/fr/frenchbench_triviaqa_fra_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..947799160bae6e8729561a68e148e9448acc54dd --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/frenchbench_triviaqa_fra_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:001b9f1cb583f9ca1f6b688d5b389dcb395903b75f8fbdb04973530d3f12887c +size 280 diff --git a/app/src/content/assets/finetasks/data/fr/meta_mmlu_fra_cf:_average_data.csv b/app/src/content/assets/finetasks/data/fr/meta_mmlu_fra_cf:_average_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..d36f90fb1ab9fed6e15594493e8e0ee217ac8d59 --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/meta_mmlu_fra_cf:_average_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d26a80ff893d40b7023cbf5cb3c988222626957a5e137a81dfa6f9628f8dbc03 +size 23998 diff --git a/app/src/content/assets/finetasks/data/fr/meta_mmlu_fra_cf:_average_stats.csv b/app/src/content/assets/finetasks/data/fr/meta_mmlu_fra_cf:_average_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..870712e1dc1923002cc4bb9f04dde2339ef5eeae --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/meta_mmlu_fra_cf:_average_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:529bd7ba13e53584db777ae8c3d8fdce18d539888ec618b08f70a23b886c0aec +size 927 diff --git a/app/src/content/assets/finetasks/data/fr/mintaka_fra_data.csv 
b/app/src/content/assets/finetasks/data/fr/mintaka_fra_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..127358d397b95ec58d474a21fe9e9f70b462a66f --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/mintaka_fra_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6017c15693179e140ed017cae687ce7d88d924fce67578d9e27cf31a7a43f34b +size 16290 diff --git a/app/src/content/assets/finetasks/data/fr/mintaka_fra_stats.csv b/app/src/content/assets/finetasks/data/fr/mintaka_fra_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..045b5be31ea4baf36b453a004c9a1d25462d7569 --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/mintaka_fra_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7858fceedf49415a2bdd7b63b4d7bf1173c54d5cc92bf0076f8ea650f416de9b +size 474 diff --git a/app/src/content/assets/finetasks/data/fr/mkqa_fra:_average_data.csv b/app/src/content/assets/finetasks/data/fr/mkqa_fra:_average_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..54996ba3bee297a329172d39e2bcab39fd0e0a4a --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/mkqa_fra:_average_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ffe31d588711ffe7a4c361ad5136e844a0b5c2bdf0acc699134c80851ad242a1 +size 17369 diff --git a/app/src/content/assets/finetasks/data/fr/mkqa_fra:_average_stats.csv b/app/src/content/assets/finetasks/data/fr/mkqa_fra:_average_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..8f8a29779f302e4e3a0a4d3681056995b388ec4f --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/mkqa_fra:_average_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7f37387dcf678baa6c6b3c5ee0033a0e52dcab8a185956b45123be50d5be887f +size 500 diff --git a/app/src/content/assets/finetasks/data/fr/mlmm_arc_fra_cf:challenge_data.csv 
b/app/src/content/assets/finetasks/data/fr/mlmm_arc_fra_cf:challenge_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..3b8f92222e19fdee61e1cbdacc4cebb8480aaec3 --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/mlmm_arc_fra_cf:challenge_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:94e9b35958ccb9cb9253bf9e137452c191c8ec534821d9df48a1f87a4df74a3e +size 17759 diff --git a/app/src/content/assets/finetasks/data/fr/mlmm_arc_fra_cf:challenge_stats.csv b/app/src/content/assets/finetasks/data/fr/mlmm_arc_fra_cf:challenge_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..efe2e59fe5b34b875075e18f03026ff82838380c --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/mlmm_arc_fra_cf:challenge_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f604ff548b2e68e4a3e534e9289a4e79cbd0008b09e4b76c336465f0af36f507 +size 924 diff --git a/app/src/content/assets/finetasks/data/fr/mlmm_hellaswag_fra_cf_data.csv b/app/src/content/assets/finetasks/data/fr/mlmm_hellaswag_fra_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..47075a5551c30a64a5d9f06e4bcd878c765e37d4 --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/mlmm_hellaswag_fra_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f9538ea4b0e49569dc76d585c52c4c1cba80861e533d0161b7ab4cc3e82ba7ae +size 18114 diff --git a/app/src/content/assets/finetasks/data/fr/mlmm_hellaswag_fra_cf_stats.csv b/app/src/content/assets/finetasks/data/fr/mlmm_hellaswag_fra_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..f949c3d4686842bae0b9697c7f2f2a73bf003119 --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/mlmm_hellaswag_fra_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:87a346743337d6b3b80134b663536dc304870d5da392c2b44dbf2b9ad8465771 +size 902 diff --git 
a/app/src/content/assets/finetasks/data/fr/mlmm_mmlu_fra_cf:_average_data.csv b/app/src/content/assets/finetasks/data/fr/mlmm_mmlu_fra_cf:_average_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..2f29a81088390803517931cc1514d2c40e4a55cc --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/mlmm_mmlu_fra_cf:_average_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:26febadea673536ceb027e122b86e36bbfd593a9c3c367a420658bdb6a0a1817 +size 24066 diff --git a/app/src/content/assets/finetasks/data/fr/mlmm_mmlu_fra_cf:_average_stats.csv b/app/src/content/assets/finetasks/data/fr/mlmm_mmlu_fra_cf:_average_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..d975f9f39342c4ca3c8149ffae3bf397b73ef249 --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/mlmm_mmlu_fra_cf:_average_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:88aa9342c41a6358efd3d7b720b5a0e57074298fba5680459370f961b17dc3ef +size 922 diff --git a/app/src/content/assets/finetasks/data/fr/mlmm_truthfulqa_fra_cf:mc1_data.csv b/app/src/content/assets/finetasks/data/fr/mlmm_truthfulqa_fra_cf:mc1_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..2e3ce67871226e7f6601a98ce50dab14062bb677 --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/mlmm_truthfulqa_fra_cf:mc1_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:76c94bd6e0fcc000479256238a0f374a1b0cc2d7d6ef1407d04f66fb07a08677 +size 24136 diff --git a/app/src/content/assets/finetasks/data/fr/mlmm_truthfulqa_fra_cf:mc1_stats.csv b/app/src/content/assets/finetasks/data/fr/mlmm_truthfulqa_fra_cf:mc1_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..014b789d10deb6051f09386ccd95b7f73509454e --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/mlmm_truthfulqa_fra_cf:mc1_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 
+oid sha256:33460c685b7b3c5abe4ef1bcc8b6ec137bb63f6adfe04dd35bbf129cb35a70cf +size 911 diff --git a/app/src/content/assets/finetasks/data/fr/mlmm_truthfulqa_fra_cf:mc2_data.csv b/app/src/content/assets/finetasks/data/fr/mlmm_truthfulqa_fra_cf:mc2_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..195e84ff56ad6fbb28789ed220fb37ff55b0b6a4 --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/mlmm_truthfulqa_fra_cf:mc2_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:75d9e8d6fb5c0c7ab944374f11e530e1e4e83b340e2f2fe0c1e58ca7f615401b +size 24666 diff --git a/app/src/content/assets/finetasks/data/fr/mlmm_truthfulqa_fra_cf:mc2_stats.csv b/app/src/content/assets/finetasks/data/fr/mlmm_truthfulqa_fra_cf:mc2_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..78768d39cc8498fa784683a90e703369ff0f9667 --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/mlmm_truthfulqa_fra_cf:mc2_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:932b29b9644c1478e5e613fe2e536bbfbe6e8c4123921a29aa97d03c4094e006 +size 933 diff --git a/app/src/content/assets/finetasks/data/fr/pawsx_fra_cf_data.csv b/app/src/content/assets/finetasks/data/fr/pawsx_fra_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..05be5ba5e8962b2062dc73acd334e3658d76ffb6 --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/pawsx_fra_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:69559034931e774864f0fa7bbf26e4b9270c8bf31338ae73b16ce19634adbdcc +size 17511 diff --git a/app/src/content/assets/finetasks/data/fr/pawsx_fra_cf_stats.csv b/app/src/content/assets/finetasks/data/fr/pawsx_fra_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..bc136991685db99769aad52106d3dfd7491821bb --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/pawsx_fra_cf_stats.csv @@ -0,0 +1,3 @@ +version 
https://git-lfs.github.com/spec/v1 +oid sha256:f07068ad7adf847efb1f476dc110f34d48c9854b41bc625cae75e8cd488b2c39 +size 873 diff --git a/app/src/content/assets/finetasks/data/fr/xcodah_fra_cf_data.csv b/app/src/content/assets/finetasks/data/fr/xcodah_fra_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..0e12d6fdd2ccf2fb7701414da45db7c450857f67 --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/xcodah_fra_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e220c63592209e00f2fba6ad62d7b98d5cad6a793258371af93930202f0a6c04 +size 21876 diff --git a/app/src/content/assets/finetasks/data/fr/xcodah_fra_cf_stats.csv b/app/src/content/assets/finetasks/data/fr/xcodah_fra_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..1515254cc1da8c08bbe60b4f47f4335b95ed75ea --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/xcodah_fra_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ef7952ed00d1a8e3e0fbb3e9408373fd26b6815815b46d7d31976c2980bc8dd5 +size 872 diff --git a/app/src/content/assets/finetasks/data/fr/xcsqa_fra_cf_data.csv b/app/src/content/assets/finetasks/data/fr/xcsqa_fra_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..f0b5d4876a05f2f8d142e16864305e8e1a317471 --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/xcsqa_fra_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b31f7a5232cb5137349fe5dcf843b047faaf6283a7107c564291a55e89e48123 +size 18021 diff --git a/app/src/content/assets/finetasks/data/fr/xcsqa_fra_cf_stats.csv b/app/src/content/assets/finetasks/data/fr/xcsqa_fra_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..1eb73d2b4258bebb5a43de084d32686d0e38a1c8 --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/xcsqa_fra_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:3f1b539390a64d4a15a819779231293d3013329fa66f43fee09dae6fccacf89b +size 868 diff --git a/app/src/content/assets/finetasks/data/fr/xnli2.0_fra_cf_data.csv b/app/src/content/assets/finetasks/data/fr/xnli2.0_fra_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..7925d7fc43b5a1b01bf8931417983401a5d8ab66 --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/xnli2.0_fra_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5d7c666db7b11f37ffeaef0f97482551ca49cf3d80714998c6afcd55d94a4662 +size 18035 diff --git a/app/src/content/assets/finetasks/data/fr/xnli2.0_fra_cf_stats.csv b/app/src/content/assets/finetasks/data/fr/xnli2.0_fra_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..fbb7c4493ed4a69b67098c283c988e85b91c172a --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/xnli2.0_fra_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:75be52724634fe9868898fb8f5640d490d22c238548c94eff4104299c0aeb56d +size 876 diff --git a/app/src/content/assets/finetasks/data/fr/xnli_fra_cf_data.csv b/app/src/content/assets/finetasks/data/fr/xnli_fra_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..b56f69a590034cc14a027ce7f33d36e3ef800f02 --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/xnli_fra_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:59e518c79ca796ee31e27a7f2ad5ae3be98c6a4e38961fcd55c29d4c047c2a5a +size 16904 diff --git a/app/src/content/assets/finetasks/data/fr/xnli_fra_cf_stats.csv b/app/src/content/assets/finetasks/data/fr/xnli_fra_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..04e136d82f34c6e340edab665b74dc132ce0ed21 --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/xnli_fra_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:6a2b997e985286aaba6171ad16d3831e89abba77360743c4f82962f58418634e +size 854 diff --git a/app/src/content/assets/finetasks/data/fr/xwinograd_fra_cf_data.csv b/app/src/content/assets/finetasks/data/fr/xwinograd_fra_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..29e1aadde8715bfe75449dfabd1e00b8e8ff29c5 --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/xwinograd_fra_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:53d5cc5ceabd0f0860b307a93024bd31e50803597cd8d9c24406e483e8ad4c59 +size 24320 diff --git a/app/src/content/assets/finetasks/data/fr/xwinograd_fra_cf_stats.csv b/app/src/content/assets/finetasks/data/fr/xwinograd_fra_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..62738065fc30d8ef28ed93aa8e8dd347b68c3b42 --- /dev/null +++ b/app/src/content/assets/finetasks/data/fr/xwinograd_fra_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:68c7ac3ea428ac9054d11b8b089e0d9c2546ad0b89c76956ddb53af3d7e8b319 +size 884 diff --git a/app/src/content/assets/finetasks/data/hi/belebele_hin_Deva_cf_data.csv b/app/src/content/assets/finetasks/data/hi/belebele_hin_Deva_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..d2bd12e17c6dc7511f4726021667f044daf962e8 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/belebele_hin_Deva_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:549a21ca19b3cea36e39b48caf4537533135161b319f2c032772a6bd29075692 +size 15778 diff --git a/app/src/content/assets/finetasks/data/hi/belebele_hin_Deva_cf_stats.csv b/app/src/content/assets/finetasks/data/hi/belebele_hin_Deva_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..714f4b6273790cd2db1c1b0ecc2d5c8038ea799f --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/belebele_hin_Deva_cf_stats.csv @@ -0,0 +1,3 @@ +version 
https://git-lfs.github.com/spec/v1 +oid sha256:3f381be7b616e497efb938d669eaf2ecfcc669117057b8651a2f118fd3efd64a +size 749 diff --git a/app/src/content/assets/finetasks/data/hi/chai_hin_data.csv b/app/src/content/assets/finetasks/data/hi/chai_hin_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..c25fb8fb26068a05d1b48f51529057269ba56589 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/chai_hin_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:aecd5852fbce76b4a7803ed86f831aa27f30a6de79b7734fcd6cb45978f76deb +size 13042 diff --git a/app/src/content/assets/finetasks/data/hi/chai_hin_stats.csv b/app/src/content/assets/finetasks/data/hi/chai_hin_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..79d675a1fea29393d30fec4e75e3b48991ec85e3 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/chai_hin_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fb9c8b9d4c08fe3844fbb855737a288668f5d1429d49ee62d4aeb9b6b4f078d7 +size 482 diff --git a/app/src/content/assets/finetasks/data/hi/community_arc_hin_cf:challenge_data.csv b/app/src/content/assets/finetasks/data/hi/community_arc_hin_cf:challenge_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..c15c90c58f9d556368a031deeb4aa3bbfe67084d --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/community_arc_hin_cf:challenge_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:adb4b9f3ee84559da3733e0c922cac72b92521551002ccfe52f542d812b76cdb +size 13976 diff --git a/app/src/content/assets/finetasks/data/hi/community_arc_hin_cf:challenge_stats.csv b/app/src/content/assets/finetasks/data/hi/community_arc_hin_cf:challenge_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..a38571b7a339e8119e06b412f90feb57f5475fbf --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/community_arc_hin_cf:challenge_stats.csv @@ 
-0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5e3bd02f7e88588fde638d994b5a225f553fab9905fc9c4b720233beb36df6e1 +size 931 diff --git a/app/src/content/assets/finetasks/data/hi/community_arc_hin_cf:easy_data.csv b/app/src/content/assets/finetasks/data/hi/community_arc_hin_cf:easy_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..b864e436c7a3628dd899f157c87835d373a4b362 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/community_arc_hin_cf:easy_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7608df069dfeeac48d939f83b0321dce2917d6867dbc110c4594546532a88c7e +size 13788 diff --git a/app/src/content/assets/finetasks/data/hi/community_arc_hin_cf:easy_stats.csv b/app/src/content/assets/finetasks/data/hi/community_arc_hin_cf:easy_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..161199519806b10c49fac1452e975b43f4b21dd0 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/community_arc_hin_cf:easy_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ab87618645ded975e70f336eb9d604f6b9ec3286acaffb519ae629e2d4b2706e +size 873 diff --git a/app/src/content/assets/finetasks/data/hi/community_boolq_hin_data.csv b/app/src/content/assets/finetasks/data/hi/community_boolq_hin_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..5c2c09c34ac633e416bd500bdfbedc310a812631 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/community_boolq_hin_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ef52c4c9d58374cab4ac420b1f426bf1d0b1269adcd9dfdfd1bd16306f07eb57 +size 14668 diff --git a/app/src/content/assets/finetasks/data/hi/community_boolq_hin_stats.csv b/app/src/content/assets/finetasks/data/hi/community_boolq_hin_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..8f8c802d6f988adb59647e1581fa73dcf3924ec7 --- /dev/null +++ 
b/app/src/content/assets/finetasks/data/hi/community_boolq_hin_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:928bdeee5eb72fc526436a21c8a2b9cffde2c79bf36bef5fd10ed8263ca6a1e0 +size 1069 diff --git a/app/src/content/assets/finetasks/data/hi/community_hellaswag_hin_cf_data.csv b/app/src/content/assets/finetasks/data/hi/community_hellaswag_hin_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..85747b2583d78bd7e0208cf00b625801855281c5 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/community_hellaswag_hin_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fdbd20ce949767054c6183671feef96b6921262784b77d9fdc1e8ccf0d2fa84d +size 13662 diff --git a/app/src/content/assets/finetasks/data/hi/community_hellaswag_hin_cf_stats.csv b/app/src/content/assets/finetasks/data/hi/community_hellaswag_hin_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..53e114f366e3cd371aee23173cd1558eff4b5255 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/community_hellaswag_hin_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e3a2a08dfcc0fd92cbb5973349d5d0f0a3770eca34c165bb4468eb924723aaad +size 923 diff --git a/app/src/content/assets/finetasks/data/hi/indicnxnli_hin_cf_data.csv b/app/src/content/assets/finetasks/data/hi/indicnxnli_hin_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..c14eb1c3a3cc279ef393e6c43272fcb1d4d919e5 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/indicnxnli_hin_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:cc5a8f2f216887a159e5dfcfb7e458c99bce5713dfe8ac27df586c8ebb4fd088 +size 12786 diff --git a/app/src/content/assets/finetasks/data/hi/indicnxnli_hin_cf_stats.csv b/app/src/content/assets/finetasks/data/hi/indicnxnli_hin_cf_stats.csv new file mode 100644 index 
0000000000000000000000000000000000000000..611a3c2906af6ce2f2f7733bb899887cc51e8f51 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/indicnxnli_hin_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5f671450a27b062a59f7931ab0bfef750a06abb51d96b4e4cf976236c5d52416 +size 856 diff --git a/app/src/content/assets/finetasks/data/hi/indicqa_hin_data.csv b/app/src/content/assets/finetasks/data/hi/indicqa_hin_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..3984f68fd3d5f5d7539e56db9c8ddcfb1f163543 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/indicqa_hin_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b0548309e178a8fc1cd788536f743c964b47c54ae7a6d17f2b1d69bda72aaf33 +size 13067 diff --git a/app/src/content/assets/finetasks/data/hi/indicqa_hin_stats.csv b/app/src/content/assets/finetasks/data/hi/indicqa_hin_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..dffa00ecafdc937c6936ec7842a8225c6239ec4a --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/indicqa_hin_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5c3eeceab13f6a5f3579f25e4745ff508c08cf22724ed2237e81959d2ddc40e6 +size 473 diff --git a/app/src/content/assets/finetasks/data/hi/indicxcopa_hin_cf_data.csv b/app/src/content/assets/finetasks/data/hi/indicxcopa_hin_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..156c7e245f5e2c29a108a7399526aa6fb038c08d --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/indicxcopa_hin_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6b2d44f35b111f5da12c928e410bf0c7963e744455fcbde8bcd9f30f889051a3 +size 18203 diff --git a/app/src/content/assets/finetasks/data/hi/indicxcopa_hin_cf_stats.csv b/app/src/content/assets/finetasks/data/hi/indicxcopa_hin_cf_stats.csv new file mode 100644 index 
0000000000000000000000000000000000000000..1deb6423de265106b5f9eaa283cc97048f707135 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/indicxcopa_hin_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c2578d591aaa75accfa3e2f8f46cc9811587d84abe44215cb50ab4865bad9688 +size 824 diff --git a/app/src/content/assets/finetasks/data/hi/meta_mmlu_hin_cf:_average_data.csv b/app/src/content/assets/finetasks/data/hi/meta_mmlu_hin_cf:_average_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..69b7a7730389cd7a6ce5aa7e895af84357d75d10 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/meta_mmlu_hin_cf:_average_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e104b4b0cafe3494f68c7b906744dd9a0bd861b8dbec5fa02e6aa215202673ac +size 18535 diff --git a/app/src/content/assets/finetasks/data/hi/meta_mmlu_hin_cf:_average_stats.csv b/app/src/content/assets/finetasks/data/hi/meta_mmlu_hin_cf:_average_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..91ccc703b1e27fbca5dfb11f9d4829dbaf5d5df0 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/meta_mmlu_hin_cf:_average_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d6e208742b4c2649d6524bc6836acab621c0e1d059b6ca470e0c7356d3059a7e +size 914 diff --git a/app/src/content/assets/finetasks/data/hi/mintaka_hin_data.csv b/app/src/content/assets/finetasks/data/hi/mintaka_hin_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..afbda813c0b9d6b57492157d6780f636a84a91bf --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/mintaka_hin_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:36f4123267b16ea279aa34427884bcf2f226719d37db48a4c3f0638491185f2b +size 12288 diff --git a/app/src/content/assets/finetasks/data/hi/mintaka_hin_stats.csv b/app/src/content/assets/finetasks/data/hi/mintaka_hin_stats.csv new file mode 
100644 index 0000000000000000000000000000000000000000..dd728043d22fdd5d34866f93026cfb24e5f6334f --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/mintaka_hin_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ff4b6dfb0eba06c7d53a17c43b73321aa8d699879564401fbb8c685d026e547d +size 488 diff --git a/app/src/content/assets/finetasks/data/hi/mlmm_arc_hin_cf:challenge_data.csv b/app/src/content/assets/finetasks/data/hi/mlmm_arc_hin_cf:challenge_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..f761e1f8551173ccc4226d8efd9be285272085f5 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/mlmm_arc_hin_cf:challenge_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:cf682e7599585f9201537cef4b365e631f75e751bb75ed06487d063f8129e715 +size 13678 diff --git a/app/src/content/assets/finetasks/data/hi/mlmm_arc_hin_cf:challenge_stats.csv b/app/src/content/assets/finetasks/data/hi/mlmm_arc_hin_cf:challenge_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..dd43df44e15f19d17eeefd982ed5ed2d941c49d8 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/mlmm_arc_hin_cf:challenge_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:470d366b3953cca6a91437f3bfd927cac3ff0693275d330e9e281a64125084fb +size 925 diff --git a/app/src/content/assets/finetasks/data/hi/mlmm_hellaswag_hin_cf_data.csv b/app/src/content/assets/finetasks/data/hi/mlmm_hellaswag_hin_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..bb42eaf249b45bbfe497c88f9b19b2f9b0d6325a --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/mlmm_hellaswag_hin_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9bb91fcc2cdc0f05d81f51897aefcabc67425d12a3f6212f371ab70003a3e67e +size 13358 diff --git a/app/src/content/assets/finetasks/data/hi/mlmm_hellaswag_hin_cf_stats.csv 
b/app/src/content/assets/finetasks/data/hi/mlmm_hellaswag_hin_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..8f99f2d5a082c9f709e105285f99753574e60d77 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/mlmm_hellaswag_hin_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:54919d77edd6822f166c5f97acb4cb287473095b31974ebc98a191fc31804646 +size 872 diff --git a/app/src/content/assets/finetasks/data/hi/mlmm_mmlu_hin_cf:_average_data.csv b/app/src/content/assets/finetasks/data/hi/mlmm_mmlu_hin_cf:_average_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..f1031bf77b5e555d3fac4fc87996d437a8027423 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/mlmm_mmlu_hin_cf:_average_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0e6a17767f41530698162af5d9e23e1a09c552e3dd150c9e04c401f9c5a0dbaf +size 18275 diff --git a/app/src/content/assets/finetasks/data/hi/mlmm_mmlu_hin_cf:_average_stats.csv b/app/src/content/assets/finetasks/data/hi/mlmm_mmlu_hin_cf:_average_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..6789663e5244d1297b01cc3c62f3a3fc18eb3562 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/mlmm_mmlu_hin_cf:_average_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4bd292bdf278be0a21430aa0e635c560d3d04628d7fcad85af1b9ad95835ed30 +size 929 diff --git a/app/src/content/assets/finetasks/data/hi/mlmm_truthfulqa_hin_cf:mc1_data.csv b/app/src/content/assets/finetasks/data/hi/mlmm_truthfulqa_hin_cf:mc1_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..139c6527e01338124b7a761769c67b71904b6db7 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/mlmm_truthfulqa_hin_cf:mc1_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2c43de528f51fa5a9378102c5b44f7a78703046052db36f9286ec366f3f493d6 +size 18213 diff 
--git a/app/src/content/assets/finetasks/data/hi/mlmm_truthfulqa_hin_cf:mc1_stats.csv b/app/src/content/assets/finetasks/data/hi/mlmm_truthfulqa_hin_cf:mc1_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..69737c0a89896b15ba4b70f0083600487990afa9 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/mlmm_truthfulqa_hin_cf:mc1_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e9c4e6197bdf339a18c80f6005206c3c885795c3cd28e1bfd8d8427b5e6f9d5c +size 910 diff --git a/app/src/content/assets/finetasks/data/hi/mlmm_truthfulqa_hin_cf:mc2_data.csv b/app/src/content/assets/finetasks/data/hi/mlmm_truthfulqa_hin_cf:mc2_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..b971300c637b7d62641a17c0fe91c7c33998cde9 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/mlmm_truthfulqa_hin_cf:mc2_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:431be39fa9c5eb3656f7642482b488656f7246d880d16e6dd5e43670aecf74fa +size 18526 diff --git a/app/src/content/assets/finetasks/data/hi/mlmm_truthfulqa_hin_cf:mc2_stats.csv b/app/src/content/assets/finetasks/data/hi/mlmm_truthfulqa_hin_cf:mc2_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..c6676e37b1d3028203636b44c815a5994d4c0f88 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/mlmm_truthfulqa_hin_cf:mc2_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a1c8d3595e84c74654346b216513cecfedf867f47ad8658648f46996e71a8fb4 +size 913 diff --git a/app/src/content/assets/finetasks/data/hi/mlqa_hin_data.csv b/app/src/content/assets/finetasks/data/hi/mlqa_hin_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..8b91392d5f77df55b294c07c95a7a6b317346406 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/mlqa_hin_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:abf9f3183ede2173da241f05a86c3939177f98714ffd6381903a1427f589b9ea +size 12021 diff --git a/app/src/content/assets/finetasks/data/hi/mlqa_hin_stats.csv b/app/src/content/assets/finetasks/data/hi/mlqa_hin_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..86b646dfcef007b1065b5c7c374e8cbfeab31e7d --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/mlqa_hin_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3c9065314093537578df766fae92801ab95c3d6adb2db95843f58991779f5156 +size 469 diff --git a/app/src/content/assets/finetasks/data/hi/tydiqa_hin_data.csv b/app/src/content/assets/finetasks/data/hi/tydiqa_hin_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..e80bfcae519b9919a9434e60c3f714aa93aa2757 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/tydiqa_hin_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7e91ca6e779ab1636af7d41cb10b7cf0b70f0612afa14362fb97e2b24b63738c +size 8846 diff --git a/app/src/content/assets/finetasks/data/hi/tydiqa_hin_stats.csv b/app/src/content/assets/finetasks/data/hi/tydiqa_hin_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..2593718d6dbd70c6db794a78563ba28a54d87048 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/tydiqa_hin_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a265a9093be40ca0b9f8d801b472356a25030c13a3e624681fe39e751f036890 +size 246 diff --git a/app/src/content/assets/finetasks/data/hi/xcodah_hin_cf_data.csv b/app/src/content/assets/finetasks/data/hi/xcodah_hin_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..66c9398a91ca605ea8b16205252646a6448caa65 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/xcodah_hin_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d495830ac36e435d96e3ef6617d851857eb6e56bcf664068bbdfe963235ae1ae +size 16607 
diff --git a/app/src/content/assets/finetasks/data/hi/xcodah_hin_cf_stats.csv b/app/src/content/assets/finetasks/data/hi/xcodah_hin_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..9f395767eeaf9a706d030cc2c48f5737e05b51bf --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/xcodah_hin_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3e1de1f659e2e3defc176579db3377b0501ae4da7067c0f6cfea33380b65e6dc +size 867 diff --git a/app/src/content/assets/finetasks/data/hi/xcsqa_hin_cf_data.csv b/app/src/content/assets/finetasks/data/hi/xcsqa_hin_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..879d8dec0b7722519544594913c9cd462a71e946 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/xcsqa_hin_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4b5636d38a0719c462b8875508811794d26671727cf528c41eac8b3c45a43bc8 +size 13799 diff --git a/app/src/content/assets/finetasks/data/hi/xcsqa_hin_cf_stats.csv b/app/src/content/assets/finetasks/data/hi/xcsqa_hin_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..fa43f61bf969d24c8a235df8090e5bb0b0d3a437 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/xcsqa_hin_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6e920a1c6d22c40787f42ce780feb73c83a8761e254c50791fce6645c2dc3596 +size 818 diff --git a/app/src/content/assets/finetasks/data/hi/xnli2.0_hin_cf_data.csv b/app/src/content/assets/finetasks/data/hi/xnli2.0_hin_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..7eaf08bd00254ac03e8b99668937a7e6ead2903e --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/xnli2.0_hin_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d2bddfeda3c52f69c381a80b088631438a561945a36df0e2e0dbc32510f8c208 +size 13385 diff --git 
a/app/src/content/assets/finetasks/data/hi/xnli2.0_hin_cf_stats.csv b/app/src/content/assets/finetasks/data/hi/xnli2.0_hin_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..9dbd373fccbc0e007d466b295d87b08785abe212 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/xnli2.0_hin_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6c5b19fc1b44e3d2ef93241b4418a4c7f8ca176676a03a3ccd5e8abbd5ba1385 +size 854 diff --git a/app/src/content/assets/finetasks/data/hi/xnli_hin_cf_data.csv b/app/src/content/assets/finetasks/data/hi/xnli_hin_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..6895965038f7374398d02043bf38c5ae085d7ce5 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/xnli_hin_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3cc6ba913b73bea2f075191b1c59c51ccacbf2847b7383387327b7314a9677b3 +size 17357 diff --git a/app/src/content/assets/finetasks/data/hi/xnli_hin_cf_stats.csv b/app/src/content/assets/finetasks/data/hi/xnli_hin_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..4ef4394afba068a6fbb5c481a33ed282bc5779e9 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/xnli_hin_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bceb5539c9212f2760663936e9293fce5733acae175e1e3568327de9d58cb814 +size 1564 diff --git a/app/src/content/assets/finetasks/data/hi/xquad_hin_data.csv b/app/src/content/assets/finetasks/data/hi/xquad_hin_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..a543782e3288215587920d967206581198c4734a --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/xquad_hin_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:801766a9256b3eb22efb76269c563bda111bdf32793d0a5097d72e986eaf852b +size 12139 diff --git a/app/src/content/assets/finetasks/data/hi/xquad_hin_stats.csv 
b/app/src/content/assets/finetasks/data/hi/xquad_hin_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..dbdfcbac94c420ca8ecc3468d97927c4834b4fb9 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/xquad_hin_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:278772974b87a4a2ef3ba2f679701ddc1be09731059d1e0c503da0e73d53af70 +size 479 diff --git a/app/src/content/assets/finetasks/data/hi/xstory_cloze_hin_cf_data.csv b/app/src/content/assets/finetasks/data/hi/xstory_cloze_hin_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..07d5c9e81ab41776ea0ab1fa323ae03c235b0bb4 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/xstory_cloze_hin_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2502b3416b0e68ab7d42341f08ba5cf71fb730a9a643921ad24366adb1b24ba8 +size 13990 diff --git a/app/src/content/assets/finetasks/data/hi/xstory_cloze_hin_cf_stats.csv b/app/src/content/assets/finetasks/data/hi/xstory_cloze_hin_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..854b75a688c0a42f1c38835290e15aca08bb5146 --- /dev/null +++ b/app/src/content/assets/finetasks/data/hi/xstory_cloze_hin_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6aa11682636a768533b2263af38f926421da3669bad56fe94e52e4d0d7d61bb0 +size 864 diff --git a/app/src/content/assets/finetasks/data/ru/belebele_rus_Cyrl_cf_data.csv b/app/src/content/assets/finetasks/data/ru/belebele_rus_Cyrl_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..91acf490890422988d3c0f0ded48d4082b2616db --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/belebele_rus_Cyrl_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a1104980f16ec54a9b1614d6da57de719996111f14c3eb797e41c39f22e8234 +size 21407 diff --git 
a/app/src/content/assets/finetasks/data/ru/belebele_rus_Cyrl_cf_stats.csv b/app/src/content/assets/finetasks/data/ru/belebele_rus_Cyrl_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..4e6d7d2d900535d1f7968fd12197794f73e970b4 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/belebele_rus_Cyrl_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4d9c08fd6af7e2e2f9126a9ade7a022f75e3c89631fcb2a4b8bf427166ea51ca +size 782 diff --git a/app/src/content/assets/finetasks/data/ru/chegeka_rus_cf_data.csv b/app/src/content/assets/finetasks/data/ru/chegeka_rus_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..e9ed3e2e4cb1c7a7df1af4493b9cb0aa99c9603f --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/chegeka_rus_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:38adbb6fe3dfc3ce749272391a9b70452dfb33ef07d8707cccf624e2e223a591 +size 15024 diff --git a/app/src/content/assets/finetasks/data/ru/chegeka_rus_cf_stats.csv b/app/src/content/assets/finetasks/data/ru/chegeka_rus_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..5d10bc2b1147fc72f7d2861ac846d7e879a5b220 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/chegeka_rus_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bf8e37662ecd9f1abe6f8521579f247294362053214fd86f43569e0b7f263027 +size 409 diff --git a/app/src/content/assets/finetasks/data/ru/chegeka_rus_data.csv b/app/src/content/assets/finetasks/data/ru/chegeka_rus_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..381eeb15f8736244255c769a39696be288f27713 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/chegeka_rus_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c53de1bd2ecf01baead7541700b6bed7930ab48e2862408f6138777102e66153 +size 15024 diff --git 
a/app/src/content/assets/finetasks/data/ru/chegeka_rus_stats.csv b/app/src/content/assets/finetasks/data/ru/chegeka_rus_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..420c5c1383985da1c6a0d9f4edbac6b433d13c30 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/chegeka_rus_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:87938a70dbc71193ed9bf498d708136ff77309246f0a800d8dddcc03d3ddfbe8 +size 403 diff --git a/app/src/content/assets/finetasks/data/ru/mathlogic_qa_rus_cf_data.csv b/app/src/content/assets/finetasks/data/ru/mathlogic_qa_rus_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..4839381145a13b753232f5e860f836b958198029 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/mathlogic_qa_rus_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d5b540374b1d7861db97a8d3eed38cfabce5d89e4168a5d3a0c5c808eb98a633 +size 24128 diff --git a/app/src/content/assets/finetasks/data/ru/mathlogic_qa_rus_cf_stats.csv b/app/src/content/assets/finetasks/data/ru/mathlogic_qa_rus_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..9efb03b432cfb50a202c8f8b3a51a86d91db11cc --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/mathlogic_qa_rus_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:478ce53e47288ebd4bcef60823247201b95341b6842713063e75f12c5b9afb7c +size 894 diff --git a/app/src/content/assets/finetasks/data/ru/mera_openbookqa_rus_cf_data.csv b/app/src/content/assets/finetasks/data/ru/mera_openbookqa_rus_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..f6ae52d8e85cd1c43ab24ba98f79430ea1a95a55 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/mera_openbookqa_rus_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e738224bb6341cd2e69255bf02dd3721b9d6f31ebb9d41985930bc4758229fae +size 18427 diff 
--git a/app/src/content/assets/finetasks/data/ru/mera_openbookqa_rus_cf_stats.csv b/app/src/content/assets/finetasks/data/ru/mera_openbookqa_rus_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..c2fe6fb879f0752aaa1cb421df794d6ed0e30da4 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/mera_openbookqa_rus_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0d8699095d98f22df1f9b087dd9390f1fd468daed26a21e8fde73457677701ea +size 916 diff --git a/app/src/content/assets/finetasks/data/ru/mera_worldtree_rus_cf_data.csv b/app/src/content/assets/finetasks/data/ru/mera_worldtree_rus_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..1b851ca40d2bcf1117f67b597adeadac4a69be26 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/mera_worldtree_rus_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c6099ccd9dd34bdc8e21308dae9bfbf4ecef0f835067dec95d673e5a07ed8b1c +size 23535 diff --git a/app/src/content/assets/finetasks/data/ru/mera_worldtree_rus_cf_stats.csv b/app/src/content/assets/finetasks/data/ru/mera_worldtree_rus_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..dc3dd59c24bf919287b1dfa28c35c6732ad09b55 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/mera_worldtree_rus_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9a87ff7d5a0adf2e97808e7699b5431916a371c438553ca398197bee7750a6bc +size 899 diff --git a/app/src/content/assets/finetasks/data/ru/mkqa_rus:_average_data.csv b/app/src/content/assets/finetasks/data/ru/mkqa_rus:_average_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..a02545c7e1eb5efff27f1195e10a47dab41a5fa0 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/mkqa_rus:_average_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:6ad85f4428d8663928981d4c9d4d223211644587e446827c0aba4385880df6c1 +size 17654 diff --git a/app/src/content/assets/finetasks/data/ru/mkqa_rus:_average_stats.csv b/app/src/content/assets/finetasks/data/ru/mkqa_rus:_average_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..24e4945a5afbc3b85218634b4abacdc8b4496381 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/mkqa_rus:_average_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d96306c4f4cf2f468ed0fdb770ddc6a430c1be3b885833debb73cf36e88fa36c +size 495 diff --git a/app/src/content/assets/finetasks/data/ru/mlmm_arc_rus_cf:challenge_data.csv b/app/src/content/assets/finetasks/data/ru/mlmm_arc_rus_cf:challenge_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..2206df2c0988cdb5df4893c6358b0d8b1dedc80e --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/mlmm_arc_rus_cf:challenge_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8c27749c0e906014f84c232bb9952e9fea5f24eec1ace173e3fee77a109509a2 +size 18081 diff --git a/app/src/content/assets/finetasks/data/ru/mlmm_arc_rus_cf:challenge_stats.csv b/app/src/content/assets/finetasks/data/ru/mlmm_arc_rus_cf:challenge_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..575fd7c5417723088657ecf2bc4ffb04e5cff1ad --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/mlmm_arc_rus_cf:challenge_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:74d48282b2c17e2c4d65d138db658b9965c210f950a49d0e61942f5f38a6b058 +size 911 diff --git a/app/src/content/assets/finetasks/data/ru/mlmm_hellaswag_rus_cf_data.csv b/app/src/content/assets/finetasks/data/ru/mlmm_hellaswag_rus_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..d5e70a6ff48468aad4a22c1f1bfc237a5505eb04 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/mlmm_hellaswag_rus_cf_data.csv @@ -0,0 
+1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0ecf69fe4d2fd5df57147e9c74f03f36db77cf71f4188f1836d27ae1e3a4a335 +size 17386 diff --git a/app/src/content/assets/finetasks/data/ru/mlmm_hellaswag_rus_cf_stats.csv b/app/src/content/assets/finetasks/data/ru/mlmm_hellaswag_rus_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..7a2adad32bb7c894687f35df2655702ccb5075f4 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/mlmm_hellaswag_rus_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:60b1e1994e9447ba81e898fa4f44a80a42d70bb2c54d5d130b7b06a2249fbc99 +size 884 diff --git a/app/src/content/assets/finetasks/data/ru/mlmm_mmlu_rus_cf:_average_data.csv b/app/src/content/assets/finetasks/data/ru/mlmm_mmlu_rus_cf:_average_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..bb5183c42418a0d3a32d9dbf13fb7d9b68911ed6 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/mlmm_mmlu_rus_cf:_average_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f2635e59283b0c633e82aa59b8cf9b0888bfea22b85b13687099f89c1adc2d26 +size 24199 diff --git a/app/src/content/assets/finetasks/data/ru/mlmm_mmlu_rus_cf:_average_stats.csv b/app/src/content/assets/finetasks/data/ru/mlmm_mmlu_rus_cf:_average_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..19f35c3d2881127eeb00e0c9f3eb333fcc21486b --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/mlmm_mmlu_rus_cf:_average_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f11ac7ca75b3212af5460f8b32d556d110434fc64692e52e697f3d44477b9211 +size 898 diff --git a/app/src/content/assets/finetasks/data/ru/mlmm_truthfulqa_rus_cf:mc1_data.csv b/app/src/content/assets/finetasks/data/ru/mlmm_truthfulqa_rus_cf:mc1_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..b9d8d92a82f3b3aa641dcbc06b4371df3e119de4 --- /dev/null +++ 
b/app/src/content/assets/finetasks/data/ru/mlmm_truthfulqa_rus_cf:mc1_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:dd9d8810bd57cdd62b6af413ece9324ee4c313ae7a2d215153a8578622f66ea2 +size 24243 diff --git a/app/src/content/assets/finetasks/data/ru/mlmm_truthfulqa_rus_cf:mc1_stats.csv b/app/src/content/assets/finetasks/data/ru/mlmm_truthfulqa_rus_cf:mc1_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..79ad5d2c68ab13dc9187ec86f514d1b713397361 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/mlmm_truthfulqa_rus_cf:mc1_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:67f23db643e63264a6b7138520ed371cb27e553988de9b2bb49588a2e92a7a27 +size 914 diff --git a/app/src/content/assets/finetasks/data/ru/mlmm_truthfulqa_rus_cf:mc2_data.csv b/app/src/content/assets/finetasks/data/ru/mlmm_truthfulqa_rus_cf:mc2_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..8f60eaba0feb865cdf162e642e9f079be9e8721e --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/mlmm_truthfulqa_rus_cf:mc2_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c5139127483bab6aa547574c30bf3b98b25e203f8eb1c3303cef76fe5579f075 +size 24737 diff --git a/app/src/content/assets/finetasks/data/ru/mlmm_truthfulqa_rus_cf:mc2_stats.csv b/app/src/content/assets/finetasks/data/ru/mlmm_truthfulqa_rus_cf:mc2_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..a5a9416548d6c956b2d7ddd4ef3ed951a48c6f3a --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/mlmm_truthfulqa_rus_cf:mc2_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1193c922230bbbc897aa1b3393dd1e3690091a0cd1fd72966e91bbf27d06cee5 +size 915 diff --git a/app/src/content/assets/finetasks/data/ru/parus_rus_cf_data.csv b/app/src/content/assets/finetasks/data/ru/parus_rus_cf_data.csv new file mode 100644 index 
0000000000000000000000000000000000000000..8df6a6edfd739b90c10f1b0246ed73c731c22d6e --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/parus_rus_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ae54d2d8039f5eac912357fe194b4b1c5308f1646d76dbacb545b31b8b5cfcee +size 18875 diff --git a/app/src/content/assets/finetasks/data/ru/parus_rus_cf_stats.csv b/app/src/content/assets/finetasks/data/ru/parus_rus_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..48735e7efabb701b07f1342b53a924dfc94bbbcc --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/parus_rus_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:effadf60b57d80829ac0b95cbdd6200d6d1a40ddb17e4a367035c8f6f283998a +size 864 diff --git a/app/src/content/assets/finetasks/data/ru/rcb_rus_cf_data.csv b/app/src/content/assets/finetasks/data/ru/rcb_rus_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..f91c41ed63a70c7f65ff6ade73455f4a83214a15 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/rcb_rus_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2f0b3a0f45ba4ade954d7f211edf62e178796fda284074bbbbb9eb685cc51886 +size 23870 diff --git a/app/src/content/assets/finetasks/data/ru/rcb_rus_cf_stats.csv b/app/src/content/assets/finetasks/data/ru/rcb_rus_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..c843dc9852e05bb91c79f402dba469bb0a28bd48 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/rcb_rus_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a9e77d502556d8c31968b545ad867680f8d220dc500a32f7da2a764df61d88cc +size 842 diff --git a/app/src/content/assets/finetasks/data/ru/rummlu_rus_cf:_average_data.csv b/app/src/content/assets/finetasks/data/ru/rummlu_rus_cf:_average_data.csv new file mode 100644 index 
0000000000000000000000000000000000000000..1830e26062cc5ca69178f395dac96137431b89ca --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/rummlu_rus_cf:_average_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b9fbfff3571e2daf313959e500047e3772a4b4f6c24461de467185e660e29c0b +size 24511 diff --git a/app/src/content/assets/finetasks/data/ru/rummlu_rus_cf:_average_stats.csv b/app/src/content/assets/finetasks/data/ru/rummlu_rus_cf:_average_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..47dc680582317c467f0dad935daa45e4f0aab325 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/rummlu_rus_cf:_average_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a548a848364803757b31fc6d5f1abd209c4a6a9d4b4961f5596b96679746662f +size 912 diff --git a/app/src/content/assets/finetasks/data/ru/sber_squad_rus_data.csv b/app/src/content/assets/finetasks/data/ru/sber_squad_rus_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..7c07d431985c388dd5b79c3bc63d658887fe8575 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/sber_squad_rus_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2ece03f1d363099ecad7c6a82e04b162732a433ee8d3b116f5d5696a2bc91369 +size 15874 diff --git a/app/src/content/assets/finetasks/data/ru/sber_squad_rus_stats.csv b/app/src/content/assets/finetasks/data/ru/sber_squad_rus_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..f5b19f47739dd712c3ef1cb7a121ba5943b65ddd --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/sber_squad_rus_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:dcb8289c54752d02a690be8b054b018cd24e46d6399e35f2c76ee54715142356 +size 489 diff --git a/app/src/content/assets/finetasks/data/ru/tydiqa_rus_data.csv b/app/src/content/assets/finetasks/data/ru/tydiqa_rus_data.csv new file mode 100644 index 
0000000000000000000000000000000000000000..c79c4192f2978bde9ded5468169e051b3d3cf024 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/tydiqa_rus_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7ab853baa3f27dad975a337f1243b9f0e9deded53f9a04f528f4a3b46c9013a3 +size 17255 diff --git a/app/src/content/assets/finetasks/data/ru/tydiqa_rus_stats.csv b/app/src/content/assets/finetasks/data/ru/tydiqa_rus_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..1086e508a06d6ff8b1a885c7c19c0379536945cc --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/tydiqa_rus_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3963b28a036f2d6b50f46d080b84aba84b9026add6c29e577793e6535ce18366 +size 478 diff --git a/app/src/content/assets/finetasks/data/ru/xcodah_rus_cf_data.csv b/app/src/content/assets/finetasks/data/ru/xcodah_rus_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..cf7cc88ad6f28ec829e1c2e48ced55028124ac4b --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/xcodah_rus_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a61aa91f4566768ea2e5354f903139742f0babd1e20f08a4971de97acfb49d68 +size 21941 diff --git a/app/src/content/assets/finetasks/data/ru/xcodah_rus_cf_stats.csv b/app/src/content/assets/finetasks/data/ru/xcodah_rus_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..ca67970eadc1e8b5de648e2100efc5efc6ded9fe --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/xcodah_rus_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a1a4ff06d886d5271e90e4a5eb5fd743eca1b11b779118fa70370bf9ba4b9559 +size 870 diff --git a/app/src/content/assets/finetasks/data/ru/xcsqa_rus_cf_data.csv b/app/src/content/assets/finetasks/data/ru/xcsqa_rus_cf_data.csv new file mode 100644 index 
0000000000000000000000000000000000000000..88e6e8072b5f97ae1c5225fbbaeb779ccfc65812 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/xcsqa_rus_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:aab0cc36004ea19c3fcd9616f71ee6940b7ae836ab17c32aeebc9f12237529d9 +size 23891 diff --git a/app/src/content/assets/finetasks/data/ru/xcsqa_rus_cf_stats.csv b/app/src/content/assets/finetasks/data/ru/xcsqa_rus_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..dd8e99b341e3f94b8b81419c71f3685d3da66df6 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/xcsqa_rus_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:74cfa3adde0d358f8df9ab8f524e56809913f934eca626f3738f5954773ce8fe +size 870 diff --git a/app/src/content/assets/finetasks/data/ru/xnli2.0_rus_cf_data.csv b/app/src/content/assets/finetasks/data/ru/xnli2.0_rus_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..076e62051a4d2e07a0e1b7a0d4457a538f84fe68 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/xnli2.0_rus_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a73520453ab4c3c14353881784de3bffcbb138070015d6c8a23eba31afe433bb +size 17220 diff --git a/app/src/content/assets/finetasks/data/ru/xnli2.0_rus_cf_stats.csv b/app/src/content/assets/finetasks/data/ru/xnli2.0_rus_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..d265ed9af43705cb2420a89bdab82910187ad6af --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/xnli2.0_rus_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4d38f6dbe4b31ec081b6799d62b86d1b09edc9cce40bcd8dc4d315f63103a188 +size 871 diff --git a/app/src/content/assets/finetasks/data/ru/xnli_rus_cf_data.csv b/app/src/content/assets/finetasks/data/ru/xnli_rus_cf_data.csv new file mode 100644 index 
0000000000000000000000000000000000000000..5d98c1cd5260eca99fea5f8bd7a626d56bb4941a --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/xnli_rus_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d55e9fe16b6a720627ecfb72064ea823070b862eca4b35676880c7cb79c34c9e +size 16701 diff --git a/app/src/content/assets/finetasks/data/ru/xnli_rus_cf_stats.csv b/app/src/content/assets/finetasks/data/ru/xnli_rus_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..e3c646d935e512896563d886cb990ac96d936916 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/xnli_rus_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:008919d250f1f4b8cca34199d506db0303a31b12dd7eb0264476cf4e379d7086 +size 858 diff --git a/app/src/content/assets/finetasks/data/ru/xquad_rus_data.csv b/app/src/content/assets/finetasks/data/ru/xquad_rus_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..22a74b0e3c72934c768bcde4bec6981248dbe9f8 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/xquad_rus_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:43d8dbf939a38ca119fdb8927aaa8727ca53183a4a7b8c6fea086f4ff740e4b2 +size 15689 diff --git a/app/src/content/assets/finetasks/data/ru/xquad_rus_stats.csv b/app/src/content/assets/finetasks/data/ru/xquad_rus_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..d1a1ead20b666e586e35644d7ea18335741ac09a --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/xquad_rus_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1364814176aaed6055fbd804963ac685ecbf3bdaf898d8559f96a4c87776425c +size 480 diff --git a/app/src/content/assets/finetasks/data/ru/xstory_cloze_rus_cf_data.csv b/app/src/content/assets/finetasks/data/ru/xstory_cloze_rus_cf_data.csv new file mode 100644 index 
0000000000000000000000000000000000000000..c0e33eb83ba41a47416502b80366123f6f981988 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/xstory_cloze_rus_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6c62c006b96a1ff2024ed7e82a3f25438f1f42c104d78e5dceaa52977daf509a +size 17583 diff --git a/app/src/content/assets/finetasks/data/ru/xstory_cloze_rus_cf_stats.csv b/app/src/content/assets/finetasks/data/ru/xstory_cloze_rus_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..7de25dde4e9f94e1682486135bb3c40d1c4c236b --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/xstory_cloze_rus_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:35f464d107d8494dcd7deebb19984282cd0a5eda62a6d3657a65e4e812b3ab69 +size 865 diff --git a/app/src/content/assets/finetasks/data/ru/xwinograd_rus_cf_data.csv b/app/src/content/assets/finetasks/data/ru/xwinograd_rus_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..11e88ae0ce6aeda3739da9b89d73ebd5cc4c0ad9 --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/xwinograd_rus_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:198040cc63726fd70f4185dbc45a1e002e991f208abb82ec27591edff3294048 +size 23848 diff --git a/app/src/content/assets/finetasks/data/ru/xwinograd_rus_cf_stats.csv b/app/src/content/assets/finetasks/data/ru/xwinograd_rus_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..4eec2584b73dcac5e3fe9fb5f3cd8f11849bf1cc --- /dev/null +++ b/app/src/content/assets/finetasks/data/ru/xwinograd_rus_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b17cadf77c3495be53d8dcc30965fadadd05ecfe354c232e89227885189483e7 +size 881 diff --git a/app/src/content/assets/finetasks/data/sw/afric_mmlu_swa_cf:_average_data.csv b/app/src/content/assets/finetasks/data/sw/afric_mmlu_swa_cf:_average_data.csv new file 
mode 100644 index 0000000000000000000000000000000000000000..573d415630f3ec45d5bb284cd642afb19780bac0 --- /dev/null +++ b/app/src/content/assets/finetasks/data/sw/afric_mmlu_swa_cf:_average_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:619d632551b5a1c35853d71e26eb6fb9d57486b4e17c362e1cb3ee5d858b8220 +size 9220 diff --git a/app/src/content/assets/finetasks/data/sw/afric_mmlu_swa_cf:_average_stats.csv b/app/src/content/assets/finetasks/data/sw/afric_mmlu_swa_cf:_average_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..f867e0dfaf9312a04aa7cc108f589705e8557881 --- /dev/null +++ b/app/src/content/assets/finetasks/data/sw/afric_mmlu_swa_cf:_average_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b166a2461d3a870eb5c1b1c6480fc9aa52e01610a75bd65e9a8a19affd4b40ee +size 865 diff --git a/app/src/content/assets/finetasks/data/sw/afric_xnli_swa_cf_data.csv b/app/src/content/assets/finetasks/data/sw/afric_xnli_swa_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..0bf52e8adfb7b1cf27302c162795c83c15f58383 --- /dev/null +++ b/app/src/content/assets/finetasks/data/sw/afric_xnli_swa_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4008b1c613e52c4962ede0c29920ff2e86badcbb00896029ba342794d779cc4b +size 10634 diff --git a/app/src/content/assets/finetasks/data/sw/afric_xnli_swa_cf_stats.csv b/app/src/content/assets/finetasks/data/sw/afric_xnli_swa_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..15f01d4500137f7bbc7ff4706f86a2f65e36e3be --- /dev/null +++ b/app/src/content/assets/finetasks/data/sw/afric_xnli_swa_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c78b906fc4e09fd5df0f4283c76adeb1448a796d2e43fb12d1aa1d9323253984 +size 650 diff --git a/app/src/content/assets/finetasks/data/sw/belebele_swh_Latn_cf_data.csv 
b/app/src/content/assets/finetasks/data/sw/belebele_swh_Latn_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..66e7d83c7d4e29d84fa2a77433a280ac0bd9386c --- /dev/null +++ b/app/src/content/assets/finetasks/data/sw/belebele_swh_Latn_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0061bed933b0af3706e2f27a358d049714a94483dbcb511996f70b2d56cfe292 +size 10830 diff --git a/app/src/content/assets/finetasks/data/sw/belebele_swh_Latn_cf_stats.csv b/app/src/content/assets/finetasks/data/sw/belebele_swh_Latn_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..2f8dd6ddfe7667b92f951dfd409017fd10bfbfde --- /dev/null +++ b/app/src/content/assets/finetasks/data/sw/belebele_swh_Latn_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b29449da109212b7ceabe975091ced3be2d9a0f935bfdb1959967ab35d6ce357 +size 740 diff --git a/app/src/content/assets/finetasks/data/sw/community_arc_swa_cf:challenge_data.csv b/app/src/content/assets/finetasks/data/sw/community_arc_swa_cf:challenge_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..2b9fa1a876eddfac9fb55ac668f3392aebcc768e --- /dev/null +++ b/app/src/content/assets/finetasks/data/sw/community_arc_swa_cf:challenge_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e0ccfd14e0fa62b337665889c68042603e0b51ead9dbca07e013a3d0c32f5c28 +size 9220 diff --git a/app/src/content/assets/finetasks/data/sw/community_arc_swa_cf:challenge_stats.csv b/app/src/content/assets/finetasks/data/sw/community_arc_swa_cf:challenge_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..5cc5f08efaf8260a9a4d1ffbe5e410658ca907a4 --- /dev/null +++ b/app/src/content/assets/finetasks/data/sw/community_arc_swa_cf:challenge_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:63b4ed71dcf4edeaef658c544da9de68af1422b421772bb9650df42b62e7366a 
+size 891 diff --git a/app/src/content/assets/finetasks/data/sw/community_arc_swa_cf:easy_data.csv b/app/src/content/assets/finetasks/data/sw/community_arc_swa_cf:easy_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..ff18b0973c33ec6ec955bb246442ec5f0169a605 --- /dev/null +++ b/app/src/content/assets/finetasks/data/sw/community_arc_swa_cf:easy_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:100c7658b13b1e5f826b7ec1985250caaf938538323e0cb61dcc3b8b9642ea1c +size 8902 diff --git a/app/src/content/assets/finetasks/data/sw/community_arc_swa_cf:easy_stats.csv b/app/src/content/assets/finetasks/data/sw/community_arc_swa_cf:easy_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..9fb7fe760fe0bb057f0372a4efa7bd7d203c69e5 --- /dev/null +++ b/app/src/content/assets/finetasks/data/sw/community_arc_swa_cf:easy_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:344694c0dfc23342014b874b933b36abc4f38bef835f41ae28f509efb46f5b5a +size 862 diff --git a/app/src/content/assets/finetasks/data/sw/community_mmlu_swa_cf_data.csv b/app/src/content/assets/finetasks/data/sw/community_mmlu_swa_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..17a58fd9e13ad7897d8d7960017d32124f5f0337 --- /dev/null +++ b/app/src/content/assets/finetasks/data/sw/community_mmlu_swa_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:87d13f448445c052345c5cc0a35cdbe06f24abaf79029d6197e3e3e67a5ed941 +size 12348 diff --git a/app/src/content/assets/finetasks/data/sw/community_mmlu_swa_cf_stats.csv b/app/src/content/assets/finetasks/data/sw/community_mmlu_swa_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..429e6738390f6c8cde92c6e1ecd7cc5fe6ac1b06 --- /dev/null +++ b/app/src/content/assets/finetasks/data/sw/community_mmlu_swa_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:28ccf34f3f914e815e374e9e1a370903c589ad7a87f1505e3abca4a37ac529b2 +size 846 diff --git a/app/src/content/assets/finetasks/data/sw/kenswquad_swa_data.csv b/app/src/content/assets/finetasks/data/sw/kenswquad_swa_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..2487ddf4f78f2be9d0432216723f094f6291516e --- /dev/null +++ b/app/src/content/assets/finetasks/data/sw/kenswquad_swa_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:24cb9545f75fa351fd967b25539767ad9795a8bfa80327cb25e3fa8e9506bb29 +size 7924 diff --git a/app/src/content/assets/finetasks/data/sw/kenswquad_swa_stats.csv b/app/src/content/assets/finetasks/data/sw/kenswquad_swa_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..4cbf84fd69c3e60ba7910daebea566ec1e00a533 --- /dev/null +++ b/app/src/content/assets/finetasks/data/sw/kenswquad_swa_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3e411b7503f7cc5033d2cba686d03a8b60a5a650e7f162c3fc05233903f6a915 +size 377 diff --git a/app/src/content/assets/finetasks/data/sw/m3exams_swa_cf_data.csv b/app/src/content/assets/finetasks/data/sw/m3exams_swa_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..48e88932133baf2578fc8723dce6185cbb395e0d --- /dev/null +++ b/app/src/content/assets/finetasks/data/sw/m3exams_swa_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d42eebc6b29ea8503297fb547742f78c128b02ed488c7974d3b709a09eef1ad1 +size 12088 diff --git a/app/src/content/assets/finetasks/data/sw/m3exams_swa_cf_stats.csv b/app/src/content/assets/finetasks/data/sw/m3exams_swa_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..4b9ceb49fd5218019c9e835e69c696f22a5379c4 --- /dev/null +++ b/app/src/content/assets/finetasks/data/sw/m3exams_swa_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:c0fcb8a52b13abfd7af289fc6fe4104d100341e4603b8f6d61c50cc0f1fa5325 +size 819 diff --git a/app/src/content/assets/finetasks/data/sw/openai_mmlu_swa_cf:_average_data.csv b/app/src/content/assets/finetasks/data/sw/openai_mmlu_swa_cf:_average_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..cf07dd70b3e86cde3e6fbf8a701d4cf1f9e652a8 --- /dev/null +++ b/app/src/content/assets/finetasks/data/sw/openai_mmlu_swa_cf:_average_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4aead6b3251813fe90d29bdf886cd243f200de7d399b04d4d35b884920e6e6fd +size 12357 diff --git a/app/src/content/assets/finetasks/data/sw/openai_mmlu_swa_cf:_average_stats.csv b/app/src/content/assets/finetasks/data/sw/openai_mmlu_swa_cf:_average_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..dad87cb8e9b6794a843d11c2bafdeb4ac61ad82a --- /dev/null +++ b/app/src/content/assets/finetasks/data/sw/openai_mmlu_swa_cf:_average_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a328d150e99eed8a3acc1286e1e2fea3a8321f866da01a92bd14537e9838d91f +size 875 diff --git a/app/src/content/assets/finetasks/data/sw/tydiqa_swa_data.csv b/app/src/content/assets/finetasks/data/sw/tydiqa_swa_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..b9ba5da5dc8a505dc3d1895c367d0d9aa1f3405a --- /dev/null +++ b/app/src/content/assets/finetasks/data/sw/tydiqa_swa_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1cd10c1554f489b1e9e652ce0c48d43594ee6144cd75861c3243569017663276 +size 8994 diff --git a/app/src/content/assets/finetasks/data/sw/tydiqa_swa_stats.csv b/app/src/content/assets/finetasks/data/sw/tydiqa_swa_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..e0e5100acbce172105a536ec102481fc59608903 --- /dev/null +++ b/app/src/content/assets/finetasks/data/sw/tydiqa_swa_stats.csv @@ -0,0 +1,3 @@ +version 
https://git-lfs.github.com/spec/v1 +oid sha256:6a5b281c289ffc7bd635d3b55d3afec374d096924f68cfbd9d19035f1bd04a33 +size 447 diff --git a/app/src/content/assets/finetasks/data/sw/xcodah_swa_cf_data.csv b/app/src/content/assets/finetasks/data/sw/xcodah_swa_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..a63a1a6524dfefa8281fc254951d7f8cc1c6021b --- /dev/null +++ b/app/src/content/assets/finetasks/data/sw/xcodah_swa_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1c86fa528cfcfb99e638faa9e636f94643ac4653f7636d8384e811176eb95cdd +size 11432 diff --git a/app/src/content/assets/finetasks/data/sw/xcodah_swa_cf_stats.csv b/app/src/content/assets/finetasks/data/sw/xcodah_swa_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..b95e81feadeef6d42386cc984a87f882ae6d46f5 --- /dev/null +++ b/app/src/content/assets/finetasks/data/sw/xcodah_swa_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b1e06c282cc3a7e2f75a3826977e907e2b338a42b13822404ce8ea6b4449e463 +size 811 diff --git a/app/src/content/assets/finetasks/data/sw/xcopa_swa_cf_data.csv b/app/src/content/assets/finetasks/data/sw/xcopa_swa_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..4b7cce4daa398adf9ca589d5bcef312764e71101 --- /dev/null +++ b/app/src/content/assets/finetasks/data/sw/xcopa_swa_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:db9b34772300149cf6d2fd261a6b1ac06f6c9255a4bdb1b6cd96717a45a489ec +size 8912 diff --git a/app/src/content/assets/finetasks/data/sw/xcopa_swa_cf_stats.csv b/app/src/content/assets/finetasks/data/sw/xcopa_swa_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..b89e6167c45e320328f1080e98aeebcbe35cd4d3 --- /dev/null +++ b/app/src/content/assets/finetasks/data/sw/xcopa_swa_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:b8c30c0bb717cbd782d63c2300ad721ebf2161100a836559e7019aadd6b55b28 +size 804 diff --git a/app/src/content/assets/finetasks/data/sw/xcsqa_swa_cf_data.csv b/app/src/content/assets/finetasks/data/sw/xcsqa_swa_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..7805173ca60e9442bab9963b42800060cf74bb1a --- /dev/null +++ b/app/src/content/assets/finetasks/data/sw/xcsqa_swa_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9142822062525239066caeb93c9e85e34fac26eaf8434a80f4ac150669478287 +size 9594 diff --git a/app/src/content/assets/finetasks/data/sw/xcsqa_swa_cf_stats.csv b/app/src/content/assets/finetasks/data/sw/xcsqa_swa_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..91c2606f316a407e3459ffcb304f5e1646d77601 --- /dev/null +++ b/app/src/content/assets/finetasks/data/sw/xcsqa_swa_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2e97daa59f874a724bfe0c793e45b66a0cb8dda8d504299d1af97fa7a6a3491c +size 807 diff --git a/app/src/content/assets/finetasks/data/sw/xnli2.0_swa_cf_data.csv b/app/src/content/assets/finetasks/data/sw/xnli2.0_swa_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..b1c48fa1d364913e86cd973149e959957a98313d --- /dev/null +++ b/app/src/content/assets/finetasks/data/sw/xnli2.0_swa_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:188be98a22b0d1cb980438851be8ee5e078aa9246dcdf1ea1458de56a4da6cee +size 9485 diff --git a/app/src/content/assets/finetasks/data/sw/xnli2.0_swa_cf_stats.csv b/app/src/content/assets/finetasks/data/sw/xnli2.0_swa_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..6d60e70e786be56655f193558a411990f95d6f49 --- /dev/null +++ b/app/src/content/assets/finetasks/data/sw/xnli2.0_swa_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:d844793e1f18f270e7a8535acfcf539d35f8489a3a4e61961131f8aafecc6dd4 +size 815 diff --git a/app/src/content/assets/finetasks/data/sw/xnli_swa_cf_data.csv b/app/src/content/assets/finetasks/data/sw/xnli_swa_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..35b0092a9bad5811c3b6615a342cd9336cf0af23 --- /dev/null +++ b/app/src/content/assets/finetasks/data/sw/xnli_swa_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3badd7960cde8d7ca8ccde51f7a5c5eb76f1c8385fbdff0a9d4108283b96e6b6 +size 13989 diff --git a/app/src/content/assets/finetasks/data/sw/xnli_swa_cf_stats.csv b/app/src/content/assets/finetasks/data/sw/xnli_swa_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..841f92636eac5a9e910cad4d729c3cb823d1d19c --- /dev/null +++ b/app/src/content/assets/finetasks/data/sw/xnli_swa_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8ec544afaf48ab7bd6bde3811d8cd1b9b8a2e4bd1cd7a69517962f65b69926a9 +size 1332 diff --git a/app/src/content/assets/finetasks/data/sw/xstory_cloze_swa_cf_data.csv b/app/src/content/assets/finetasks/data/sw/xstory_cloze_swa_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..9225eeffcff9fa3cecee8002b2c887fc6e1b0866 --- /dev/null +++ b/app/src/content/assets/finetasks/data/sw/xstory_cloze_swa_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c7e6a426baa080c91df3bc42d68644dd0dc50b9cda247e6cb5c6f43dc4ba4eff +size 10354 diff --git a/app/src/content/assets/finetasks/data/sw/xstory_cloze_swa_cf_stats.csv b/app/src/content/assets/finetasks/data/sw/xstory_cloze_swa_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..96b56fbd38ed3df0284cc406be81978a737b883d --- /dev/null +++ b/app/src/content/assets/finetasks/data/sw/xstory_cloze_swa_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:68f3b16a1b349644edb572fd5b041c9a5c5e54d5e652fa09ca5e6ebe569bada3 +size 843 diff --git a/app/src/content/assets/finetasks/data/te/belebele_tel_Telu_cf_data.csv b/app/src/content/assets/finetasks/data/te/belebele_tel_Telu_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..ff2daa4bda28cd3d1da59a999e02cc5bbc4b175c --- /dev/null +++ b/app/src/content/assets/finetasks/data/te/belebele_tel_Telu_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:647e4c040d05a05bbe499f24b88bea07c0daea15f81ba2df55153c7a8ba94f21 +size 10715 diff --git a/app/src/content/assets/finetasks/data/te/belebele_tel_Telu_cf_stats.csv b/app/src/content/assets/finetasks/data/te/belebele_tel_Telu_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..f577be26efb07fdea95609d5b129537f872b9cc2 --- /dev/null +++ b/app/src/content/assets/finetasks/data/te/belebele_tel_Telu_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:381ef612429b55ff9c30098426c6bbfa54e97d0297f18e6a143bdc1bc20d6df9 +size 738 diff --git a/app/src/content/assets/finetasks/data/te/community_hellaswag_tel_cf_data.csv b/app/src/content/assets/finetasks/data/te/community_hellaswag_tel_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..bd1536b7ce625bee59f1a50967238c73c07daa67 --- /dev/null +++ b/app/src/content/assets/finetasks/data/te/community_hellaswag_tel_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e94664acbb42bffa1d249499b38ee8214672baf8268cfee4914697b06eb91acf +size 10690 diff --git a/app/src/content/assets/finetasks/data/te/community_hellaswag_tel_cf_stats.csv b/app/src/content/assets/finetasks/data/te/community_hellaswag_tel_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..f3b47d230b3162ba1f3687c86b17d6e83e490793 --- /dev/null +++ 
b/app/src/content/assets/finetasks/data/te/community_hellaswag_tel_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:18326e431258a9bf9b1ca5c0d206f1e04b44fb8ee181049dfa76c0016db4b13f +size 857 diff --git a/app/src/content/assets/finetasks/data/te/indicnxnli_tel_cf_data.csv b/app/src/content/assets/finetasks/data/te/indicnxnli_tel_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..457a3e3ccfa95c37f1df7ce0ecd57af05a319a27 --- /dev/null +++ b/app/src/content/assets/finetasks/data/te/indicnxnli_tel_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:579f33d05bed88001b352531ba83930662c551ab3b8d9cf0d3c6f12ac34533b3 +size 8459 diff --git a/app/src/content/assets/finetasks/data/te/indicnxnli_tel_cf_stats.csv b/app/src/content/assets/finetasks/data/te/indicnxnli_tel_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..eefcc6eb15d56ef39db2d1851148b6a63bee0663 --- /dev/null +++ b/app/src/content/assets/finetasks/data/te/indicnxnli_tel_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a388bcbc254097f56961d1bb168457081d12be49b2b7d5d5c41ee7e88ac86fb7 +size 835 diff --git a/app/src/content/assets/finetasks/data/te/indicqa_tel_data.csv b/app/src/content/assets/finetasks/data/te/indicqa_tel_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..14340b82f746b0f61545168e3c248f357c6bb844 --- /dev/null +++ b/app/src/content/assets/finetasks/data/te/indicqa_tel_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c12f3800f1acb5768ed12388b598c2dc1d79c4aff865c58521b939b60970290f +size 9354 diff --git a/app/src/content/assets/finetasks/data/te/indicqa_tel_stats.csv b/app/src/content/assets/finetasks/data/te/indicqa_tel_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..24f3279e8a9ec8887d6e4c25ed03f87cf654ba18 --- /dev/null +++ 
b/app/src/content/assets/finetasks/data/te/indicqa_tel_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:778521f806ea8e46131667cdb2325fcebdd752b938f540898728bfebd69c2625 +size 459 diff --git a/app/src/content/assets/finetasks/data/te/indicxcopa_tel_cf_data.csv b/app/src/content/assets/finetasks/data/te/indicxcopa_tel_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..34ee18255c20ccfd35fbd586cd4c7608f46285df --- /dev/null +++ b/app/src/content/assets/finetasks/data/te/indicxcopa_tel_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bafb40ca092a0f848046319d8ed790a6e0e9a63526507069acacd692ed50d660 +size 9399 diff --git a/app/src/content/assets/finetasks/data/te/indicxcopa_tel_cf_stats.csv b/app/src/content/assets/finetasks/data/te/indicxcopa_tel_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..1ac1fb5343708b9934a36fbd03bfbc0c30ba068b --- /dev/null +++ b/app/src/content/assets/finetasks/data/te/indicxcopa_tel_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:67df9434da0afc2e38b2dbc0b5715680b552995bacc793a8f90769987aec1d62 +size 811 diff --git a/app/src/content/assets/finetasks/data/te/mlmm_arc_tel_cf:challenge_data.csv b/app/src/content/assets/finetasks/data/te/mlmm_arc_tel_cf:challenge_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..3eabce664603651d40420cbdc1b2f32c4134ad8b --- /dev/null +++ b/app/src/content/assets/finetasks/data/te/mlmm_arc_tel_cf:challenge_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:007604a5f9ba7a187397ce48cc94fb33b8d934ec7f41943743520c830e586ee7 +size 9293 diff --git a/app/src/content/assets/finetasks/data/te/mlmm_arc_tel_cf:challenge_stats.csv b/app/src/content/assets/finetasks/data/te/mlmm_arc_tel_cf:challenge_stats.csv new file mode 100644 index 
0000000000000000000000000000000000000000..b66a37054a9901aaff2e50ca6b7ec536181d2827 --- /dev/null +++ b/app/src/content/assets/finetasks/data/te/mlmm_arc_tel_cf:challenge_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:93699f66d9719bae26cedd313cde247c1e76cf48b7cc270067afb69b6c78db4d +size 873 diff --git a/app/src/content/assets/finetasks/data/te/mlmm_hellaswag_tel_cf_data.csv b/app/src/content/assets/finetasks/data/te/mlmm_hellaswag_tel_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..58e8c081c01035d6f5725303ed78104965aec407 --- /dev/null +++ b/app/src/content/assets/finetasks/data/te/mlmm_hellaswag_tel_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0cfe2294af567263caa4c696422820a0f04f5999fd68c819c7779ad619decc31 +size 8968 diff --git a/app/src/content/assets/finetasks/data/te/mlmm_hellaswag_tel_cf_stats.csv b/app/src/content/assets/finetasks/data/te/mlmm_hellaswag_tel_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..6c32116b75ac66d999876f2cf3ea77e6eb7f1dde --- /dev/null +++ b/app/src/content/assets/finetasks/data/te/mlmm_hellaswag_tel_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c1649f19dc74ec34946181aa88b9824ca0a205b61c7c2e248ebaf05e4f009168 +size 847 diff --git a/app/src/content/assets/finetasks/data/te/mlmm_mmlu_tel_cf:_average_data.csv b/app/src/content/assets/finetasks/data/te/mlmm_mmlu_tel_cf:_average_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..9c450b36aabb8be56ec54a51ac1ec8ac0d5c3142 --- /dev/null +++ b/app/src/content/assets/finetasks/data/te/mlmm_mmlu_tel_cf:_average_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2c645bf055f085a04a0d88c6fc3c487b2af035edd2b296a213fbf4dccef8ceff +size 12406 diff --git a/app/src/content/assets/finetasks/data/te/mlmm_mmlu_tel_cf:_average_stats.csv 
b/app/src/content/assets/finetasks/data/te/mlmm_mmlu_tel_cf:_average_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..fdee95094f05909303434e4ae47a932f4a903bfb --- /dev/null +++ b/app/src/content/assets/finetasks/data/te/mlmm_mmlu_tel_cf:_average_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:54fc83af883cb498af0f6a22c8deac6cd477bd2f78906f8a44aff5bda377839d +size 871 diff --git a/app/src/content/assets/finetasks/data/te/mlmm_truthfulqa_tel_cf:mc1_data.csv b/app/src/content/assets/finetasks/data/te/mlmm_truthfulqa_tel_cf:mc1_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..5a450d10fac5dff1a7b3b919cc0976aedffbda88 --- /dev/null +++ b/app/src/content/assets/finetasks/data/te/mlmm_truthfulqa_tel_cf:mc1_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1ad99e88d26f4ee07ec84531f0782e47c8fa0ef62a2c2c3e4a7263a2eedeef3e +size 12331 diff --git a/app/src/content/assets/finetasks/data/te/mlmm_truthfulqa_tel_cf:mc1_stats.csv b/app/src/content/assets/finetasks/data/te/mlmm_truthfulqa_tel_cf:mc1_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..9e4cf38c1e6944b10427a844c8248594a6588b2d --- /dev/null +++ b/app/src/content/assets/finetasks/data/te/mlmm_truthfulqa_tel_cf:mc1_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e2012b5e6d306712f0ffe252656f4e92f14d89d1d726d99137be97d5090a8b6b +size 871 diff --git a/app/src/content/assets/finetasks/data/te/mlmm_truthfulqa_tel_cf:mc2_data.csv b/app/src/content/assets/finetasks/data/te/mlmm_truthfulqa_tel_cf:mc2_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..dd59947095976a57ab64b1d539b6eeee47a0312b --- /dev/null +++ b/app/src/content/assets/finetasks/data/te/mlmm_truthfulqa_tel_cf:mc2_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:dca5fe7883543618846cfaacf5a8d675a07318fdb7339fbc55e3526764fdd8b0 
+size 12781 diff --git a/app/src/content/assets/finetasks/data/te/mlmm_truthfulqa_tel_cf:mc2_stats.csv b/app/src/content/assets/finetasks/data/te/mlmm_truthfulqa_tel_cf:mc2_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..54922c7d0c69ea96d7e15343e434f3aeef198df1 --- /dev/null +++ b/app/src/content/assets/finetasks/data/te/mlmm_truthfulqa_tel_cf:mc2_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d73adc0dea379297efdaf88cca40c664b2b7d014f25040a048f7ae58378179f0 +size 878 diff --git a/app/src/content/assets/finetasks/data/te/tydiqa_tel_data.csv b/app/src/content/assets/finetasks/data/te/tydiqa_tel_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..3d375b157dfeb4365284393c4ad029f9f7722157 --- /dev/null +++ b/app/src/content/assets/finetasks/data/te/tydiqa_tel_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:242772b4ddb4f5ee052d7fddf9c7a3c4abc04da377dc517ac09990fcecedae9a +size 8979 diff --git a/app/src/content/assets/finetasks/data/te/tydiqa_tel_stats.csv b/app/src/content/assets/finetasks/data/te/tydiqa_tel_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..1a04ae5c8347f9b61555b9028bb9e1cccdea6fd5 --- /dev/null +++ b/app/src/content/assets/finetasks/data/te/tydiqa_tel_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f6aa8056b07542533927406650d5bd89214975c46f99cabad35a7af795927436 +size 453 diff --git a/app/src/content/assets/finetasks/data/te/xstory_cloze_tel_cf_data.csv b/app/src/content/assets/finetasks/data/te/xstory_cloze_tel_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..e92d5f56bdb671841a4216cd75d0a012ed5a93df --- /dev/null +++ b/app/src/content/assets/finetasks/data/te/xstory_cloze_tel_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0cac9a86124a67e7f525511771bc0200707659ddc36271637790c7e1120be23d +size 9428 diff 
--git a/app/src/content/assets/finetasks/data/te/xstory_cloze_tel_cf_stats.csv b/app/src/content/assets/finetasks/data/te/xstory_cloze_tel_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..a67de52350ce95792c2c6e5d490753911459902c --- /dev/null +++ b/app/src/content/assets/finetasks/data/te/xstory_cloze_tel_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:af41acd284fbd26359de67d17886a341a849a7164699fe37a8e0d49c7651b62e +size 834 diff --git a/app/src/content/assets/finetasks/data/th/belebele_tha_Thai_cf_data.csv b/app/src/content/assets/finetasks/data/th/belebele_tha_Thai_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..bc9646669a92b169d4c480290204741859ba89df --- /dev/null +++ b/app/src/content/assets/finetasks/data/th/belebele_tha_Thai_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6d979f8e3a6fd703974923969a9a3401a979f700dc6549abd0a0e696b6a97afb +size 21224 diff --git a/app/src/content/assets/finetasks/data/th/belebele_tha_Thai_cf_stats.csv b/app/src/content/assets/finetasks/data/th/belebele_tha_Thai_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..66432f5a1504af492faeae4fb274522b64610732 --- /dev/null +++ b/app/src/content/assets/finetasks/data/th/belebele_tha_Thai_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:15402b17ac88966242c97ada21b719ac99f7f3944448f6f455f898900e3e2c2f +size 782 diff --git a/app/src/content/assets/finetasks/data/th/community_hellaswag_tha_cf_data.csv b/app/src/content/assets/finetasks/data/th/community_hellaswag_tha_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..bccd132ec42fee7cb7052ec8ad9c7228d6e30948 --- /dev/null +++ b/app/src/content/assets/finetasks/data/th/community_hellaswag_tha_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:a9731405a26692653ca08c1caea28689b8adc164da41d1495a8365b516daa6ec +size 18864 diff --git a/app/src/content/assets/finetasks/data/th/community_hellaswag_tha_cf_stats.csv b/app/src/content/assets/finetasks/data/th/community_hellaswag_tha_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..e78cdcb42d436ba54321525d36308c019ae50b19 --- /dev/null +++ b/app/src/content/assets/finetasks/data/th/community_hellaswag_tha_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2ae7e104cac8839ab00953799e62a95c65dddc2fce5dd5e879cce43610908ead +size 929 diff --git a/app/src/content/assets/finetasks/data/th/m3exams_tha_cf_data.csv b/app/src/content/assets/finetasks/data/th/m3exams_tha_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..41f848f468cfbc6919d91273a5f9965d416090c3 --- /dev/null +++ b/app/src/content/assets/finetasks/data/th/m3exams_tha_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bcd50ece85e38e3b2b4e9eb7421190de08a66efcf7324d82b25208b3b4f77000 +size 17425 diff --git a/app/src/content/assets/finetasks/data/th/m3exams_tha_cf_stats.csv b/app/src/content/assets/finetasks/data/th/m3exams_tha_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..ccbd0ac0bb974230ce0d68e7b59933e80ab38791 --- /dev/null +++ b/app/src/content/assets/finetasks/data/th/m3exams_tha_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:af1fba5ae71f4e97e042d59fd531339090f5ff44aadf99d53c012e39f3fb73d2 +size 875 diff --git a/app/src/content/assets/finetasks/data/th/meta_mmlu_tha_cf:_average_data.csv b/app/src/content/assets/finetasks/data/th/meta_mmlu_tha_cf:_average_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..184a03501ddb22bc76f3fc452b6f66a28f92d877 --- /dev/null +++ b/app/src/content/assets/finetasks/data/th/meta_mmlu_tha_cf:_average_data.csv @@ -0,0 +1,3 @@ +version 
https://git-lfs.github.com/spec/v1 +oid sha256:5d7c6cc4e38c1b5afc755456381abefa000884e8d7c40258f0a0647a8fffdbd2 +size 24389 diff --git a/app/src/content/assets/finetasks/data/th/meta_mmlu_tha_cf:_average_stats.csv b/app/src/content/assets/finetasks/data/th/meta_mmlu_tha_cf:_average_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..ff658a6dc41b1360cba069ba6669725d869079ba --- /dev/null +++ b/app/src/content/assets/finetasks/data/th/meta_mmlu_tha_cf:_average_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2be1b0033d86ccfe177cfb46d7b0d2ed96d4c62e6eae5bb04f1c47713da17876 +size 935 diff --git a/app/src/content/assets/finetasks/data/th/mkqa_tha:_average_data.csv b/app/src/content/assets/finetasks/data/th/mkqa_tha:_average_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..22fb8c54d3284b0da8dc9314a041513fc53088b9 --- /dev/null +++ b/app/src/content/assets/finetasks/data/th/mkqa_tha:_average_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2f975bdd9598a214ea9a7fc46ebe9b9dde8e54a89f30bae6ef2a2df0d06ff466 +size 17479 diff --git a/app/src/content/assets/finetasks/data/th/mkqa_tha:_average_stats.csv b/app/src/content/assets/finetasks/data/th/mkqa_tha:_average_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..ab902eb66798ec3497ef0297d7c0b9de9fcd734f --- /dev/null +++ b/app/src/content/assets/finetasks/data/th/mkqa_tha:_average_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:99a99c34c1e1a737630a29c9020d2855af6417ed9443e11c28b1d0a147ecace3 +size 494 diff --git a/app/src/content/assets/finetasks/data/th/thai_exams_tha_cf:_average_data.csv b/app/src/content/assets/finetasks/data/th/thai_exams_tha_cf:_average_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..12ec8bde6e835191d3dd7841a5d2bf6fb22f31a0 --- /dev/null +++ 
b/app/src/content/assets/finetasks/data/th/thai_exams_tha_cf:_average_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bcae6794cfc1abbf2cde7935f744880f114f6578215ffda39bdf624b191dde3c +size 24653 diff --git a/app/src/content/assets/finetasks/data/th/thai_exams_tha_cf:_average_stats.csv b/app/src/content/assets/finetasks/data/th/thai_exams_tha_cf:_average_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..553e68a5dceb9f0d58bfbab27ce067a115cab6bc --- /dev/null +++ b/app/src/content/assets/finetasks/data/th/thai_exams_tha_cf:_average_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:72e22c018e52ebd4b15f3301a6e286e6e57de48f9072b5179c795af89b2d4c52 +size 914 diff --git a/app/src/content/assets/finetasks/data/th/thai_exams_tha_cf:tgat_data.csv b/app/src/content/assets/finetasks/data/th/thai_exams_tha_cf:tgat_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..cc7f488b7dff0b32ae73030fec4128585adc66cf --- /dev/null +++ b/app/src/content/assets/finetasks/data/th/thai_exams_tha_cf:tgat_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f7b7167156381c511ad0800a55f4e9cfd6de56a481640ea4e444cc25f6ba15bf +size 23053 diff --git a/app/src/content/assets/finetasks/data/th/thai_exams_tha_cf:tgat_stats.csv b/app/src/content/assets/finetasks/data/th/thai_exams_tha_cf:tgat_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..1281fcd9fa24b58c5a291e0cffb69c837ad07666 --- /dev/null +++ b/app/src/content/assets/finetasks/data/th/thai_exams_tha_cf:tgat_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8058a0edc28158983b7bceb2efe2b04a958aac4973f0a784cc16a6f8e929ccb7 +size 903 diff --git a/app/src/content/assets/finetasks/data/th/thaiqa_tha_data.csv b/app/src/content/assets/finetasks/data/th/thaiqa_tha_data.csv new file mode 100644 index 
0000000000000000000000000000000000000000..fe73dcc56e09276fbe70383ad265b88c6629f962 --- /dev/null +++ b/app/src/content/assets/finetasks/data/th/thaiqa_tha_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ac8710166a633a63e66f199b5973fe6bbe6da06cae0e885f02bdcefb4b8fbdb7 +size 15809 diff --git a/app/src/content/assets/finetasks/data/th/thaiqa_tha_stats.csv b/app/src/content/assets/finetasks/data/th/thaiqa_tha_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..c1875ab481417d187490c5f3b1e94c47a197410b --- /dev/null +++ b/app/src/content/assets/finetasks/data/th/thaiqa_tha_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:dd6bff823527d5727bee4f1601f5a9ee4e5779a2db649e95cbb9a34dd94b1fcb +size 466 diff --git a/app/src/content/assets/finetasks/data/th/wsci_tha_cf_data.csv b/app/src/content/assets/finetasks/data/th/wsci_tha_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..9f5ee77f74dce39b0a6376623e829c2d1f90313b --- /dev/null +++ b/app/src/content/assets/finetasks/data/th/wsci_tha_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a9349637e6813bcdc5f2d65aed3617e0943eef8278c4eea3e85cfd9ca29622a7 +size 23144 diff --git a/app/src/content/assets/finetasks/data/th/wsci_tha_cf_stats.csv b/app/src/content/assets/finetasks/data/th/wsci_tha_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..da9f38f00ad57cbe24564a6afd3fedee3ef50c7e --- /dev/null +++ b/app/src/content/assets/finetasks/data/th/wsci_tha_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6240447b715facaa807a2422270903ca05db3f4db2706c2cae53949e4d301792 +size 853 diff --git a/app/src/content/assets/finetasks/data/th/xcopa_tha_cf_data.csv b/app/src/content/assets/finetasks/data/th/xcopa_tha_cf_data.csv new file mode 100644 index 
0000000000000000000000000000000000000000..bbb6f84690a74472419fd3a2e00d8dfbd57cac1f --- /dev/null +++ b/app/src/content/assets/finetasks/data/th/xcopa_tha_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2b3a561c3a0f46053e885e1ee95c521621f8c937100a2bcba759e81db1d52005 +size 17797 diff --git a/app/src/content/assets/finetasks/data/th/xcopa_tha_cf_stats.csv b/app/src/content/assets/finetasks/data/th/xcopa_tha_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..e8823f4a041fe2785ca2740a2a73711c22ddb9ad --- /dev/null +++ b/app/src/content/assets/finetasks/data/th/xcopa_tha_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:62bd2f176b7ebed548e8008ed4696080a1d543499fb21553397027eed6cfc37f +size 863 diff --git a/app/src/content/assets/finetasks/data/th/xnli2.0_tha_cf_data.csv b/app/src/content/assets/finetasks/data/th/xnli2.0_tha_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..df1066393895890d159f1435c5bf66cdcc4430fc --- /dev/null +++ b/app/src/content/assets/finetasks/data/th/xnli2.0_tha_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6c3fb870b7fc75407489ba5fd8a7f2103e1c369f85f8eb6586fab6f49f3cfff1 +size 16728 diff --git a/app/src/content/assets/finetasks/data/th/xnli2.0_tha_cf_stats.csv b/app/src/content/assets/finetasks/data/th/xnli2.0_tha_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..e1305f364be3b1a28445e395fa7fbfe000345edf --- /dev/null +++ b/app/src/content/assets/finetasks/data/th/xnli2.0_tha_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:831e3870c5aa7bcadbd6f440765c194d6e13a31177971733c6c2cab7bb2722c4 +size 846 diff --git a/app/src/content/assets/finetasks/data/th/xnli_tha_cf_data.csv b/app/src/content/assets/finetasks/data/th/xnli_tha_cf_data.csv new file mode 100644 index 
0000000000000000000000000000000000000000..9b6b4ea0fecde8ec6069e6563c008d360a86988c --- /dev/null +++ b/app/src/content/assets/finetasks/data/th/xnli_tha_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b0efd827082cf21c3a897a180df6c5f5d81883bb4dee1d7cf2a00a1298be3f86 +size 16355 diff --git a/app/src/content/assets/finetasks/data/th/xnli_tha_cf_stats.csv b/app/src/content/assets/finetasks/data/th/xnli_tha_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..9287be9f2a1e3c0cb61aeb4bc385f8d9eb1c44cb --- /dev/null +++ b/app/src/content/assets/finetasks/data/th/xnli_tha_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1b75efced7c75167a8bc178b70a07681f964c9421c8da5b3d3addf189c58ff4f +size 867 diff --git a/app/src/content/assets/finetasks/data/th/xquad_tha_data.csv b/app/src/content/assets/finetasks/data/th/xquad_tha_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..a80eb5f38faa7488adc8220f13b806e41b110dd7 --- /dev/null +++ b/app/src/content/assets/finetasks/data/th/xquad_tha_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:48b13687855c884b84e5cb32c6d3c9660beedc07affb41fc79fd01785e2ab2a4 +size 15842 diff --git a/app/src/content/assets/finetasks/data/th/xquad_tha_stats.csv b/app/src/content/assets/finetasks/data/th/xquad_tha_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..1b11f67df3dafc9e8a6414a83ddcd852ca802ace --- /dev/null +++ b/app/src/content/assets/finetasks/data/th/xquad_tha_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f2359e425d5d9606ced4fea91c41f45756faf3e3246309fa8f3013713e9fe258 +size 479 diff --git a/app/src/content/assets/finetasks/data/tr/belebele_tur_Latn_cf_data.csv b/app/src/content/assets/finetasks/data/tr/belebele_tur_Latn_cf_data.csv new file mode 100644 index 
0000000000000000000000000000000000000000..5350df31e6e9a421135069f473037184ddd70b17 --- /dev/null +++ b/app/src/content/assets/finetasks/data/tr/belebele_tur_Latn_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:278f70b9fb4c6caeae386bc3b6b02a0974d746b134876484dcd5dc27bc382b4c +size 21166 diff --git a/app/src/content/assets/finetasks/data/tr/belebele_tur_Latn_cf_stats.csv b/app/src/content/assets/finetasks/data/tr/belebele_tur_Latn_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..5097b3a5bb2a2f2217cad513c9a8bb3f42ac0d1a --- /dev/null +++ b/app/src/content/assets/finetasks/data/tr/belebele_tur_Latn_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:62acf26bb33ca8191a154f8b6dad2d545e7039cd71540782b9bb82e5c4f9d304 +size 782 diff --git a/app/src/content/assets/finetasks/data/tr/community_arc_tur_cf:easy_data.csv b/app/src/content/assets/finetasks/data/tr/community_arc_tur_cf:easy_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..3e2f2a78665e18dbb2eee7b4d1ece80425081990 --- /dev/null +++ b/app/src/content/assets/finetasks/data/tr/community_arc_tur_cf:easy_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:db74c4eadd2d0df7f74282287239cfdf3d187fe2828b1a9849ec5df45f0ca2fe +size 17830 diff --git a/app/src/content/assets/finetasks/data/tr/community_arc_tur_cf:easy_stats.csv b/app/src/content/assets/finetasks/data/tr/community_arc_tur_cf:easy_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..bf33278b1b3a7601370bd1332f1c6fdb0b3a0a8c --- /dev/null +++ b/app/src/content/assets/finetasks/data/tr/community_arc_tur_cf:easy_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:158b44ad366de4b72118ffa276de460cfabbc6f00f7916f55b4646a622898e71 +size 903 diff --git a/app/src/content/assets/finetasks/data/tr/community_hellaswag_tur_cf_data.csv 
b/app/src/content/assets/finetasks/data/tr/community_hellaswag_tur_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..3e6f1f6752fbbda21493f8822643771ea5f6a5d7 --- /dev/null +++ b/app/src/content/assets/finetasks/data/tr/community_hellaswag_tur_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e8491355258a0bffa59b4c6873d1395fd8b42f0d012ca7cca40dcd3d724aba30 +size 18280 diff --git a/app/src/content/assets/finetasks/data/tr/community_hellaswag_tur_cf_stats.csv b/app/src/content/assets/finetasks/data/tr/community_hellaswag_tur_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..14df13bed77c278d63e0693831731cbb99a13884 --- /dev/null +++ b/app/src/content/assets/finetasks/data/tr/community_hellaswag_tur_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bb456c19a824928c197acf2ab5ece3d94dfea3e543585494e8923aac0a3f1ffa +size 925 diff --git a/app/src/content/assets/finetasks/data/tr/community_mmlu_tur_cf:_average_data.csv b/app/src/content/assets/finetasks/data/tr/community_mmlu_tur_cf:_average_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..cd038fab4477661e34bf1ec2c783de921a6f65b9 --- /dev/null +++ b/app/src/content/assets/finetasks/data/tr/community_mmlu_tur_cf:_average_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:97788b865ff30845bf22d27440e63a80db7019e5eb068f22483a19bb8a402206 +size 24131 diff --git a/app/src/content/assets/finetasks/data/tr/community_mmlu_tur_cf:_average_stats.csv b/app/src/content/assets/finetasks/data/tr/community_mmlu_tur_cf:_average_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..ec9211beb854b838bd6faf83fab94982b9bc7da8 --- /dev/null +++ b/app/src/content/assets/finetasks/data/tr/community_mmlu_tur_cf:_average_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:0ca865483a7f2f6e0c24cbdcb276b256e06197b3a248bfae85664eaa31e1c2a1 +size 947 diff --git a/app/src/content/assets/finetasks/data/tr/community_truthfulqa_tur_cf:mc1_data.csv b/app/src/content/assets/finetasks/data/tr/community_truthfulqa_tur_cf:mc1_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..57dacfe6a38ce86931d9b9f3bc404b27e7957605 --- /dev/null +++ b/app/src/content/assets/finetasks/data/tr/community_truthfulqa_tur_cf:mc1_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:76cc087bf3061714a35c4e422be63f78ec51a6f51202d08339b13892b08356ce +size 24417 diff --git a/app/src/content/assets/finetasks/data/tr/community_truthfulqa_tur_cf:mc1_stats.csv b/app/src/content/assets/finetasks/data/tr/community_truthfulqa_tur_cf:mc1_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..5e2259ec9bfbf8092a3578a93b90573e5fc2044f --- /dev/null +++ b/app/src/content/assets/finetasks/data/tr/community_truthfulqa_tur_cf:mc1_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8d5e4e6829a3476e15c6647d1725996083e0bda1bedff443572e1a2196dfc4dc +size 946 diff --git a/app/src/content/assets/finetasks/data/tr/community_truthfulqa_tur_cf:mc2_data.csv b/app/src/content/assets/finetasks/data/tr/community_truthfulqa_tur_cf:mc2_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..096c1ef4638be4455ede8819c6c23c47a21438ca --- /dev/null +++ b/app/src/content/assets/finetasks/data/tr/community_truthfulqa_tur_cf:mc2_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b5813994523a91ff291297881cb4be34b412755ef661a346fe76d8b3ca146355 +size 24825 diff --git a/app/src/content/assets/finetasks/data/tr/community_truthfulqa_tur_cf:mc2_stats.csv b/app/src/content/assets/finetasks/data/tr/community_truthfulqa_tur_cf:mc2_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..c546a3c64b829c702331170e42999e4994f1a657 --- 
/dev/null +++ b/app/src/content/assets/finetasks/data/tr/community_truthfulqa_tur_cf:mc2_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e320b8822eca04ce3da53de413ca47e4793a76d055b9976bc68d439feebf0aad +size 935 diff --git a/app/src/content/assets/finetasks/data/tr/community_xwinograd_tur_cf_data.csv b/app/src/content/assets/finetasks/data/tr/community_xwinograd_tur_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..ebcbdefb16d94a77e843036bd72b4928673cca45 --- /dev/null +++ b/app/src/content/assets/finetasks/data/tr/community_xwinograd_tur_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:660df6e34912aaa3f942e4ee5f7dd407628e693a35662d6120d15f0eb194c927 +size 18861 diff --git a/app/src/content/assets/finetasks/data/tr/community_xwinograd_tur_cf_stats.csv b/app/src/content/assets/finetasks/data/tr/community_xwinograd_tur_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..84e02355b4c6ee8808bff605c2eedf909269742e --- /dev/null +++ b/app/src/content/assets/finetasks/data/tr/community_xwinograd_tur_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5621e38cf6f4eeb905c390e806ecdcdc8aa8a4a3e793a2036937241ade707ca3 +size 932 diff --git a/app/src/content/assets/finetasks/data/tr/exams_tur_cf:_average_data.csv b/app/src/content/assets/finetasks/data/tr/exams_tur_cf:_average_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..6a9eca3b95b83f66e311d0cc3ca83a5f606001e4 --- /dev/null +++ b/app/src/content/assets/finetasks/data/tr/exams_tur_cf:_average_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:dce8f5f93dda7e2f81bf7528cdca8c9ce8e4075ea20c11d89c09f39eb19e20f1 +size 24130 diff --git a/app/src/content/assets/finetasks/data/tr/exams_tur_cf:_average_stats.csv b/app/src/content/assets/finetasks/data/tr/exams_tur_cf:_average_stats.csv new file mode 100644 index 
0000000000000000000000000000000000000000..62272eab4fe8d05e70d3125da41f42fc6e4c660e --- /dev/null +++ b/app/src/content/assets/finetasks/data/tr/exams_tur_cf:_average_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:08b1cd9d21dd4d4cdf60f14f409d80d62e7b5549c0e602aed3e9f02f2ecb9bd2 +size 904 diff --git a/app/src/content/assets/finetasks/data/tr/mkqa_tur:_average_data.csv b/app/src/content/assets/finetasks/data/tr/mkqa_tur:_average_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..bac19177c79a77f116f26f10341a15b36131e91f --- /dev/null +++ b/app/src/content/assets/finetasks/data/tr/mkqa_tur:_average_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:352d479a7cd7baf60bf42f45045546ccdf186527b30dbb35c8eb65e6fcbd2840 +size 17423 diff --git a/app/src/content/assets/finetasks/data/tr/mkqa_tur:_average_stats.csv b/app/src/content/assets/finetasks/data/tr/mkqa_tur:_average_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..a1eefc9a34b94a715c2e0474574f0dc3049a76a1 --- /dev/null +++ b/app/src/content/assets/finetasks/data/tr/mkqa_tur:_average_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1375ca1e07dba6e8320e0c4a28b64680b821cff68531157c580482949a2fb90b +size 499 diff --git a/app/src/content/assets/finetasks/data/tr/tquadv2_tur_data.csv b/app/src/content/assets/finetasks/data/tr/tquadv2_tur_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..5ada254988e2f12bea75b7968195b4c21802c5a3 --- /dev/null +++ b/app/src/content/assets/finetasks/data/tr/tquadv2_tur_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e2b3e1d4584e61f8aa0ad9fd20413eafecbc6139a5ed9d990b3936a9b7e99342 +size 15819 diff --git a/app/src/content/assets/finetasks/data/tr/tquadv2_tur_stats.csv b/app/src/content/assets/finetasks/data/tr/tquadv2_tur_stats.csv new file mode 100644 index 
0000000000000000000000000000000000000000..6da363fb161722e7694e2d9cf10c98d93332990e --- /dev/null +++ b/app/src/content/assets/finetasks/data/tr/tquadv2_tur_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9a58d64f460976e7a350f85eb7d8fba2097c840e816962e08b6bdbf548f4e9dd +size 484 diff --git a/app/src/content/assets/finetasks/data/tr/xcopa_tur_cf_data.csv b/app/src/content/assets/finetasks/data/tr/xcopa_tur_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..efbc3b4a55163cafaa2e0fd58844766b93f4bebf --- /dev/null +++ b/app/src/content/assets/finetasks/data/tr/xcopa_tur_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:638570a564d7349b5e82ae241be5d831e7bf2e0373bfe60c874699a989ad60b9 +size 17371 diff --git a/app/src/content/assets/finetasks/data/tr/xcopa_tur_cf_stats.csv b/app/src/content/assets/finetasks/data/tr/xcopa_tur_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..9c167300b7e886400a3db9ce49936df12b2b8713 --- /dev/null +++ b/app/src/content/assets/finetasks/data/tr/xcopa_tur_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fed3d810953dd8ae2213f843d2361596629cae5e38a27f22bbe820119469afa4 +size 873 diff --git a/app/src/content/assets/finetasks/data/tr/xnli2.0_tur_cf_data.csv b/app/src/content/assets/finetasks/data/tr/xnli2.0_tur_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..d73f9b22ebbe0b3f6bfc6d9701e1b54192e8436d --- /dev/null +++ b/app/src/content/assets/finetasks/data/tr/xnli2.0_tur_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d80ca6f5bef9c5783b36cba5430d5307656624ff4d66182d578492e33a92c6eb +size 18026 diff --git a/app/src/content/assets/finetasks/data/tr/xnli2.0_tur_cf_stats.csv b/app/src/content/assets/finetasks/data/tr/xnli2.0_tur_cf_stats.csv new file mode 100644 index 
0000000000000000000000000000000000000000..757459a161742a8cf32b35f545a3c65cc3ce1f05 --- /dev/null +++ b/app/src/content/assets/finetasks/data/tr/xnli2.0_tur_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ef4d93199215ad6df5939d3f989478349ca446e42da6d84ea4a1b75098eb2a2f +size 874 diff --git a/app/src/content/assets/finetasks/data/tr/xnli_tur_cf_data.csv b/app/src/content/assets/finetasks/data/tr/xnli_tur_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..c775864b8c0a4a81d6d73e014a7e6013d6b613e6 --- /dev/null +++ b/app/src/content/assets/finetasks/data/tr/xnli_tur_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:789834ab4ae38a8c6a3fd2720b697a3d8902a62c6028273632fc92b686af06b5 +size 17795 diff --git a/app/src/content/assets/finetasks/data/tr/xnli_tur_cf_stats.csv b/app/src/content/assets/finetasks/data/tr/xnli_tur_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..6d386425010df150de1b4861cc30477db01711e2 --- /dev/null +++ b/app/src/content/assets/finetasks/data/tr/xnli_tur_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1d0140d8b340e6511535dc26adc38257672c023b1a49f881799e68320976b7fe +size 854 diff --git a/app/src/content/assets/finetasks/data/tr/xquad_tur_data.csv b/app/src/content/assets/finetasks/data/tr/xquad_tur_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..e01ecb1f87d26c604a3d9309d6b9ea941ccaf02a --- /dev/null +++ b/app/src/content/assets/finetasks/data/tr/xquad_tur_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d027c98af7825aa044b0257f507746ada84d4a55012ee256e5039dc541e6118d +size 15828 diff --git a/app/src/content/assets/finetasks/data/tr/xquad_tur_stats.csv b/app/src/content/assets/finetasks/data/tr/xquad_tur_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..09a659387b14caa44471a140d2e6fed24b37c792 
--- /dev/null +++ b/app/src/content/assets/finetasks/data/tr/xquad_tur_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c20f2514458252bb10dd6ad72b1f247c21eef5922a6b1bd7c33bf2384fc74571 +size 484 diff --git a/app/src/content/assets/finetasks/data/zh/agieval_zho_cf:_average_data.csv b/app/src/content/assets/finetasks/data/zh/agieval_zho_cf:_average_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..6baee338768604e7ea1afc45ce96e1f95ecd6bfe --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/agieval_zho_cf:_average_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7122c6aba9fe4088e1ed62a37af1193659bd4b6a04fe20ece8ef9d7db09c5a80 +size 24164 diff --git a/app/src/content/assets/finetasks/data/zh/agieval_zho_cf:_average_stats.csv b/app/src/content/assets/finetasks/data/zh/agieval_zho_cf:_average_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..e7abe3e92b4a8d0fc11cf8ec2880d0df658ac8db --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/agieval_zho_cf:_average_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7fe8a6fdabd286baf7cf4d9476aafe45e4a5354516b1ee434bd5ade5cce4b50d +size 916 diff --git a/app/src/content/assets/finetasks/data/zh/belebele_zho_Hans_cf_data.csv b/app/src/content/assets/finetasks/data/zh/belebele_zho_Hans_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..6ae767d342d3739e0f0ad7f044562e2fc4c29fd6 --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/belebele_zho_Hans_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ab32ae441c6910fca2806c691be148265e589fc59f753bacd5e9ba003384ec83 +size 21342 diff --git a/app/src/content/assets/finetasks/data/zh/belebele_zho_Hans_cf_stats.csv b/app/src/content/assets/finetasks/data/zh/belebele_zho_Hans_cf_stats.csv new file mode 100644 index 
0000000000000000000000000000000000000000..ce370f5d44c8a98d40da742516175d8bfbdd6199 --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/belebele_zho_Hans_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:38bcda1ee873981050eeaa4d90d2e5a4154c9ffd62e2fafdc892af99a7520871 +size 781 diff --git a/app/src/content/assets/finetasks/data/zh/c3_zho_cf_data.csv b/app/src/content/assets/finetasks/data/zh/c3_zho_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..9216aa48d7355756af8919eb0f6e4e980ed5b0d8 --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/c3_zho_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9dc5e13fcc90c6bd6d1c8406767b5e73329036b25b49d35784173385e2ba8ffd +size 18177 diff --git a/app/src/content/assets/finetasks/data/zh/c3_zho_cf_stats.csv b/app/src/content/assets/finetasks/data/zh/c3_zho_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..3d3ed79ee154f864dbd308d8f71f1710fbb002f5 --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/c3_zho_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b46ebb61b3e985973ac02181f64344019955eb0473e412e88b932c5fcde397a2 +size 858 diff --git a/app/src/content/assets/finetasks/data/zh/ceval_zho_cf:_average_data.csv b/app/src/content/assets/finetasks/data/zh/ceval_zho_cf:_average_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..75e29b12e5d8c8b48ec7def7505f7dd6c26b7364 --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/ceval_zho_cf:_average_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:411abd6894d05edc95d1e625d451824654926bc875278a119a0ed0c2b2d9660e +size 24130 diff --git a/app/src/content/assets/finetasks/data/zh/ceval_zho_cf:_average_stats.csv b/app/src/content/assets/finetasks/data/zh/ceval_zho_cf:_average_stats.csv new file mode 100644 index 
0000000000000000000000000000000000000000..924abd31eb28c2c8cea5b388bbf83da204ffa720 --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/ceval_zho_cf:_average_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4ccc235be2e06222e8eaf67aa735a2d679d7355b62084717aa3a0284bc71bd28 +size 885 diff --git a/app/src/content/assets/finetasks/data/zh/chinese_squad_zho_data.csv b/app/src/content/assets/finetasks/data/zh/chinese_squad_zho_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..259b9804a9aa863d9cb7c05e866f44986fd51433 --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/chinese_squad_zho_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:21714ca036eda60f51a1386ef7d7fc81853d403c8eaa9b69dd7df6433d965124 +size 15708 diff --git a/app/src/content/assets/finetasks/data/zh/chinese_squad_zho_stats.csv b/app/src/content/assets/finetasks/data/zh/chinese_squad_zho_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..c25a59278cbf6ecd05a2d4dcc0b9f05ea5060ad1 --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/chinese_squad_zho_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:37250ca029d98d12d9819fdcb89edb5f95c3d3ae7a19a5fa5b88c08f82b0ffdf +size 497 diff --git a/app/src/content/assets/finetasks/data/zh/cmath_zho_cf_data.csv b/app/src/content/assets/finetasks/data/zh/cmath_zho_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..26e2afc733e0da172c3d54381b407455b52e6b25 --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/cmath_zho_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0ecddf4a5fe4ffc1ba1894d2c5be484f13936ef516d9bf5b0f6bc2f5a4c93e6f +size 15734 diff --git a/app/src/content/assets/finetasks/data/zh/cmath_zho_cf_stats.csv b/app/src/content/assets/finetasks/data/zh/cmath_zho_cf_stats.csv new file mode 100644 index 
0000000000000000000000000000000000000000..1e6f61c40ef307bf62dbf0520878a02f71526a0e --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/cmath_zho_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:dfebb593ee1dbf1bb83f8ad3833b25acdf283ee40a562f4640c50fc4024d1ef8 +size 486 diff --git a/app/src/content/assets/finetasks/data/zh/cmmlu_zho_cf:_average_data.csv b/app/src/content/assets/finetasks/data/zh/cmmlu_zho_cf:_average_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..a589090c3fb07c3998aa539cba86423acbd0c560 --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/cmmlu_zho_cf:_average_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c8862c892691c3a677e47d7020b79dad536d3e5db41c6f232028fd1320014fc4 +size 24016 diff --git a/app/src/content/assets/finetasks/data/zh/cmmlu_zho_cf:_average_stats.csv b/app/src/content/assets/finetasks/data/zh/cmmlu_zho_cf:_average_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..d84b9f085d8a0b476c1b5143078e1b3cbb0b24d7 --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/cmmlu_zho_cf:_average_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4e3e95807c42bdc295cad807773eca8dc62acd46c287e42044d3c08efbf77e9d +size 896 diff --git a/app/src/content/assets/finetasks/data/zh/cmnli_zho_cf_data.csv b/app/src/content/assets/finetasks/data/zh/cmnli_zho_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..da40c8d1fe7745233d56d817ce73e0ca6f82817c --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/cmnli_zho_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:dc6f90e35bb68a6b6577baa7770ab53d0675af51832b71cc2a0b5db578202b32 +size 16281 diff --git a/app/src/content/assets/finetasks/data/zh/cmnli_zho_cf_stats.csv b/app/src/content/assets/finetasks/data/zh/cmnli_zho_cf_stats.csv new file mode 100644 index 
0000000000000000000000000000000000000000..d4a62e73b4f6dbc2541945f6b9f00484a6fa6f46 --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/cmnli_zho_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:af7e15660f180c66ba41784dd3dec8fe9bac8e166ad03b7ada4cf7b4e264f10c +size 831 diff --git a/app/src/content/assets/finetasks/data/zh/cmrc2018_zho_data.csv b/app/src/content/assets/finetasks/data/zh/cmrc2018_zho_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..1341ed308347253faa073c244b27b7a9561eab6e --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/cmrc2018_zho_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:771a9003cbc11f3e31a9f6ebb739a66f8e921d87f0283f1217eddd8d3307842f +size 17189 diff --git a/app/src/content/assets/finetasks/data/zh/cmrc2018_zho_stats.csv b/app/src/content/assets/finetasks/data/zh/cmrc2018_zho_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..0ecbd97fe317f5fd9db59c70014ddf8f86896c6d --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/cmrc2018_zho_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:12612733a2e31610d5d5704cfed8cb9307c27cbaa124f76417f98632f66e8946 +size 484 diff --git a/app/src/content/assets/finetasks/data/zh/m3exams_zho_cf_data.csv b/app/src/content/assets/finetasks/data/zh/m3exams_zho_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..bc632b4f877476d5ee4e9b5f34c0b30fa51e8774 --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/m3exams_zho_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b65e5e95dd3d8cad30315d5262ef00d22d6ae4b567fb24b5c8ac459b14a23904 +size 23926 diff --git a/app/src/content/assets/finetasks/data/zh/m3exams_zho_cf_stats.csv b/app/src/content/assets/finetasks/data/zh/m3exams_zho_cf_stats.csv new file mode 100644 index 
0000000000000000000000000000000000000000..8de3e419bd05970bca32e62c6d27a44dcd8ef731 --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/m3exams_zho_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c9d7bc1852e91f4e0f82c466b9bf39d270a3f86d4276f47fa77a317f1d14d4a8 +size 868 diff --git a/app/src/content/assets/finetasks/data/zh/mkqa_zho:_average_data.csv b/app/src/content/assets/finetasks/data/zh/mkqa_zho:_average_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..b8d91d4eb258ec0dfeed79ad3fda1f488392490c --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/mkqa_zho:_average_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f981fb5b7a45b04b860adc6962084a1f5b421bb3b69d625703386d1436374623 +size 17844 diff --git a/app/src/content/assets/finetasks/data/zh/mkqa_zho:_average_stats.csv b/app/src/content/assets/finetasks/data/zh/mkqa_zho:_average_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..e1edbee277c6d37250e6b99dfa78b2931e5df48e --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/mkqa_zho:_average_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d8426243e85a0cf4bb863e5542b902a3f98de921ef3447f6e8f4ef9a2ff82870 +size 479 diff --git a/app/src/content/assets/finetasks/data/zh/mlmm_arc_zho_cf:challenge_data.csv b/app/src/content/assets/finetasks/data/zh/mlmm_arc_zho_cf:challenge_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..c484788d0207477a2953ab2afb180a98097ae279 --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/mlmm_arc_zho_cf:challenge_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2af6aad275b584ff1bcfbbd5649dd2cacc59fd6a5f6a6965490def74cdbc9313 +size 18061 diff --git a/app/src/content/assets/finetasks/data/zh/mlmm_arc_zho_cf:challenge_stats.csv 
b/app/src/content/assets/finetasks/data/zh/mlmm_arc_zho_cf:challenge_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..0191ee3b773d7054b54ed6be3f26703229a01521 --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/mlmm_arc_zho_cf:challenge_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:52330e5defb07956b153b04b70987456e8af2150d8d54814f0563239b8f92cc7 +size 925 diff --git a/app/src/content/assets/finetasks/data/zh/mlmm_hellaswag_zho_cf_data.csv b/app/src/content/assets/finetasks/data/zh/mlmm_hellaswag_zho_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..414aa183fd02b5d7ca8a1e327eadbdda383ba98c --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/mlmm_hellaswag_zho_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:70d214efe058215c8f76d75ed4352087a56a015836711789f774728ce5e20646 +size 17439 diff --git a/app/src/content/assets/finetasks/data/zh/mlmm_hellaswag_zho_cf_stats.csv b/app/src/content/assets/finetasks/data/zh/mlmm_hellaswag_zho_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..4445f70766e814a7701e569526eb3b9ace29d17b --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/mlmm_hellaswag_zho_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:34f08e06dc180d2f569d9a2bc45baadfedd7fec6ed6a2e1b5b17e023e9236d7f +size 890 diff --git a/app/src/content/assets/finetasks/data/zh/mlmm_mmlu_zho_cf:_average_data.csv b/app/src/content/assets/finetasks/data/zh/mlmm_mmlu_zho_cf:_average_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..43228a1fdddb955ed4eff672a6ae9d1d21381657 --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/mlmm_mmlu_zho_cf:_average_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:96641a947d538a9b3d1d18c2e23a7a3b385c3b64d69f3064591dcfe3da2d949e +size 24153 diff --git 
a/app/src/content/assets/finetasks/data/zh/mlmm_mmlu_zho_cf:_average_stats.csv b/app/src/content/assets/finetasks/data/zh/mlmm_mmlu_zho_cf:_average_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..fe72f7a413279dcc476c624b3c48c6511e56bf44 --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/mlmm_mmlu_zho_cf:_average_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6b8bf8ef54e1c9fd1210242d6be7ef4ff8a96b063b97ef7780a8f657ee287450 +size 932 diff --git a/app/src/content/assets/finetasks/data/zh/mlmm_truthfulqa_zho_cf:mc1_data.csv b/app/src/content/assets/finetasks/data/zh/mlmm_truthfulqa_zho_cf:mc1_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..25837c6e8e60d754a2697c556757bb5ae5acf3ec --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/mlmm_truthfulqa_zho_cf:mc1_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7e63f463601c023e5dda2b33a2760dd15edb99bd93fdb7e2c5b8310c33b1d7e7 +size 24326 diff --git a/app/src/content/assets/finetasks/data/zh/mlmm_truthfulqa_zho_cf:mc1_stats.csv b/app/src/content/assets/finetasks/data/zh/mlmm_truthfulqa_zho_cf:mc1_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..059fa5bc5165d7b24bb83e8af0caa4ebb8bf5a9f --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/mlmm_truthfulqa_zho_cf:mc1_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ed99c80a32268118af703a33640d3c32fac50f9c7ead99d168c0e1cf2580c217 +size 916 diff --git a/app/src/content/assets/finetasks/data/zh/mlmm_truthfulqa_zho_cf:mc2_data.csv b/app/src/content/assets/finetasks/data/zh/mlmm_truthfulqa_zho_cf:mc2_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..51b2fe1bf19df35f095baf66f53c325c6f5d9e0d --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/mlmm_truthfulqa_zho_cf:mc2_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 
+oid sha256:09bf2a828bda79ab0d3c8198c7bc41a30c390d9e751905656778ac9bbbbd3b1b +size 24770 diff --git a/app/src/content/assets/finetasks/data/zh/mlmm_truthfulqa_zho_cf:mc2_stats.csv b/app/src/content/assets/finetasks/data/zh/mlmm_truthfulqa_zho_cf:mc2_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..39b3c8d07c57b6b6f6f0534a4752402b3404d52f --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/mlmm_truthfulqa_zho_cf:mc2_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:09960448e264ea4a24b7705b907a82baf73e7a8ddb39aaeed800aaf9ca64c48e +size 937 diff --git a/app/src/content/assets/finetasks/data/zh/ocnli_zho_cf_data.csv b/app/src/content/assets/finetasks/data/zh/ocnli_zho_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..ba70c95436706b3ba14bd35c1f85d5c9f73a8144 --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/ocnli_zho_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8e594290ae7c12d13c6007d0e3b555c7715aa45c5ef258dd86312f47545d45d8 +size 17116 diff --git a/app/src/content/assets/finetasks/data/zh/ocnli_zho_cf_stats.csv b/app/src/content/assets/finetasks/data/zh/ocnli_zho_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..6b014522917baa2a329861297d78a1ee61fac81a --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/ocnli_zho_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:311ca5949c969c3864fe2eee1011aae4b50d3e392a2bc759bf4820fe09ac4c83 +size 863 diff --git a/app/src/content/assets/finetasks/data/zh/pawsx_zho_cf_data.csv b/app/src/content/assets/finetasks/data/zh/pawsx_zho_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..673a7485350580156018abba1f0799349c701b8d --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/pawsx_zho_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:0cb2a1cb2400b35108a08d6dd465689f33e08f9ed1a83b4030c6b3db0ed66571 +size 18690 diff --git a/app/src/content/assets/finetasks/data/zh/pawsx_zho_cf_stats.csv b/app/src/content/assets/finetasks/data/zh/pawsx_zho_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..a009c5e80393facb928ee6d7b4bb20fc750cd2cc --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/pawsx_zho_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e59122f2374405138fe665a9217dd597f2d60ba66059b2d4b5593aa88326253e +size 874 diff --git a/app/src/content/assets/finetasks/data/zh/xcodah_zho_cf_data.csv b/app/src/content/assets/finetasks/data/zh/xcodah_zho_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..f989a043da0d91aafaab3ac03dfa63697e92de61 --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/xcodah_zho_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ee96fc5ea3f856d9692baf00fb47cf5a5b07456ded1333704d772ea5e6317f31 +size 24124 diff --git a/app/src/content/assets/finetasks/data/zh/xcodah_zho_cf_stats.csv b/app/src/content/assets/finetasks/data/zh/xcodah_zho_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..dd2462adb32abdd90d6223581f3cccd7254e299f --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/xcodah_zho_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:97851b7aa4df5060deec039734375d4d8d66e44f1b475b478e1b4fc3d4250e84 +size 859 diff --git a/app/src/content/assets/finetasks/data/zh/xcopa_zho_cf_data.csv b/app/src/content/assets/finetasks/data/zh/xcopa_zho_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..527206ff547307cbb353969c089da7bedd5fd191 --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/xcopa_zho_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:bb6f17881296186afaecc6968657270b75a2035891f39ba6a121eebbc3dd8dce +size 17603 diff --git a/app/src/content/assets/finetasks/data/zh/xcopa_zho_cf_stats.csv b/app/src/content/assets/finetasks/data/zh/xcopa_zho_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..c9b4c42bf29073bed127cf17d33ff45a6100ad0c --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/xcopa_zho_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:205df9f85b963e3ddbed27d790aa76e9691c40276f31d0422bc4b5476617a773 +size 861 diff --git a/app/src/content/assets/finetasks/data/zh/xcsqa_zho_cf_data.csv b/app/src/content/assets/finetasks/data/zh/xcsqa_zho_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..cc02f16955dbbf09c50586deeff839542e5d2e01 --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/xcsqa_zho_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c181fd0808f93ee9ce45dcd568a01f706cf580af941c43c7dcfc4c1fed123f85 +size 17983 diff --git a/app/src/content/assets/finetasks/data/zh/xcsqa_zho_cf_stats.csv b/app/src/content/assets/finetasks/data/zh/xcsqa_zho_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..7b90f15d3657ec0ddb29ed59067b1c0e98f0f8ae --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/xcsqa_zho_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ecfe10919a740e64eb4967bb8186b1ee8b99b33f21ace0b619720fcd925bd2b1 +size 868 diff --git a/app/src/content/assets/finetasks/data/zh/xnli2.0_zho_cf_data.csv b/app/src/content/assets/finetasks/data/zh/xnli2.0_zho_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..4d7993ad1e0c9663e5b7d4e7e844947b2d8c2571 --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/xnli2.0_zho_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:ef688afd5e241ebe8574990a610daa8de7fdb6399d3146fd98c403c3a831fb01 +size 16452 diff --git a/app/src/content/assets/finetasks/data/zh/xnli2.0_zho_cf_stats.csv b/app/src/content/assets/finetasks/data/zh/xnli2.0_zho_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..f4528b9976eba2e7f369de85501f19803e54fdd8 --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/xnli2.0_zho_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:82a10c02d01db00ff4b9dc60c1d2c0bab85d225f1653815c35de70d20c9b2a4f +size 883 diff --git a/app/src/content/assets/finetasks/data/zh/xnli_zho_cf_data.csv b/app/src/content/assets/finetasks/data/zh/xnli_zho_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..145fd33cc8264c547e04e36f6f838e50268615f0 --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/xnli_zho_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:402b91bf737da5b57b33b929272a09f46d838ebff8ff4c515f95bb3284c84e0f +size 16384 diff --git a/app/src/content/assets/finetasks/data/zh/xnli_zho_cf_stats.csv b/app/src/content/assets/finetasks/data/zh/xnli_zho_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..3b79abd0ac6ad16b7749a0220b1422efdb404df3 --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/xnli_zho_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a87eb625c2f9aa6546ff689553063e15e88255c6dfe0cb37593e0d81952d8d61 +size 861 diff --git a/app/src/content/assets/finetasks/data/zh/xquad_zho_data.csv b/app/src/content/assets/finetasks/data/zh/xquad_zho_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..8fe57050051113c380b260b9970671944176cb83 --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/xquad_zho_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:29c6405d5051c4037e07b5da2d076270b394f731962fb227d7c01952c3771511 
+size 15782 diff --git a/app/src/content/assets/finetasks/data/zh/xquad_zho_stats.csv b/app/src/content/assets/finetasks/data/zh/xquad_zho_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..6f54530782b32775efd92db8424334221dcaae1a --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/xquad_zho_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8709391e3320a895fd92a1d8d70a89ce1ce49901a309fe96717c3c904730daa4 +size 478 diff --git a/app/src/content/assets/finetasks/data/zh/xstory_cloze_zho_cf_data.csv b/app/src/content/assets/finetasks/data/zh/xstory_cloze_zho_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..dea0a8c53883fac2399a2ddf0e1abd2462abbec0 --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/xstory_cloze_zho_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3e1b1521b94d38e91825e4af6fad645477ba8e3b2fbbaba2990095da96db50aa +size 17951 diff --git a/app/src/content/assets/finetasks/data/zh/xstory_cloze_zho_cf_stats.csv b/app/src/content/assets/finetasks/data/zh/xstory_cloze_zho_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..c37c9eff18f8dbd2553ba0cb11444a1e591f581d --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/xstory_cloze_zho_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:06beb26fee81cca6ba3f15a0e0037e062767011815abc829cb9c422d2d22768d +size 895 diff --git a/app/src/content/assets/finetasks/data/zh/xwinograd_zho_cf_data.csv b/app/src/content/assets/finetasks/data/zh/xwinograd_zho_cf_data.csv new file mode 100644 index 0000000000000000000000000000000000000000..b261d7b27960ed5fb655919755d069e05ee53ebc --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/xwinograd_zho_cf_data.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7db060c66d6cb3fb83dd1b3d9f28bc6043291afb5354d534cb4d6a9432257057 +size 24146 diff 
--git a/app/src/content/assets/finetasks/data/zh/xwinograd_zho_cf_stats.csv b/app/src/content/assets/finetasks/data/zh/xwinograd_zho_cf_stats.csv new file mode 100644 index 0000000000000000000000000000000000000000..4f7143b7dc64073bdf2618e21ff0b3ef2242cb93 --- /dev/null +++ b/app/src/content/assets/finetasks/data/zh/xwinograd_zho_cf_stats.csv @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0dfba2bb38da71eb3aaba7aab0d3e9c56fe22015ac99d46d3aad912807301c26 +size 881 diff --git a/app/src/content/assets/generated_figs/intro.d3 b/app/src/content/assets/generated_figs/intro.d3 new file mode 100644 index 0000000000000000000000000000000000000000..7e6ff5b1c2ed9bd564a2679c64fc64e8cb056ff1 --- /dev/null +++ b/app/src/content/assets/generated_figs/intro.d3 @@ -0,0 +1,111 @@ + + + + + + Intro Chart + + + + +
        + + + + \ No newline at end of file diff --git a/app/src/content/assets/image/best_annotation_practices.png b/app/src/content/assets/image/best_annotation_practices.png new file mode 100644 index 0000000000000000000000000000000000000000..15f80e898dd2f3fab0d7f3ce17e4e646483df81b --- /dev/null +++ b/app/src/content/assets/image/best_annotation_practices.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4aa15bb7d2335cb49749aed274aedbf7dd701c51b0bfac01e69998ddc5b1ddc5 +size 80814 diff --git a/app/src/content/assets/image/chat-templates-and-tokenisation.png b/app/src/content/assets/image/chat-templates-and-tokenisation.png new file mode 100644 index 0000000000000000000000000000000000000000..66553e6a4b6a38f724c9993299df4c1e4f805478 --- /dev/null +++ b/app/src/content/assets/image/chat-templates-and-tokenisation.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0a3e4762ba6d5ecb79519b2533eb739af571c284a1cae8332a28e81906fe018c +size 209038 diff --git a/app/src/content/assets/image/env.png b/app/src/content/assets/image/env.png new file mode 100644 index 0000000000000000000000000000000000000000..3e125106c4bb21e471aaf7e6036a65c2f05c55d2 --- /dev/null +++ b/app/src/content/assets/image/env.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fbcef0931fde4128efb8451e71e3c1bdbd663e2eb8e82f2bf982a3f048f376a7 +size 658712 diff --git a/app/src/content/assets/image/finevision.png b/app/src/content/assets/image/finevision.png new file mode 100644 index 0000000000000000000000000000000000000000..070c72c13927ddcbedd410d96739c23318881335 --- /dev/null +++ b/app/src/content/assets/image/finevision.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e6c02ff75943442df6667b162c406c11acb5cacf219c09ab6e60774125e5fb8b +size 257529 diff --git a/app/src/content/assets/image/llm_gen.png b/app/src/content/assets/image/llm_gen.png new file mode 100644 index 
0000000000000000000000000000000000000000..a076f56aca95248520d002208dfba3a32d39ef40 --- /dev/null +++ b/app/src/content/assets/image/llm_gen.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:624a840ff723d0f99fd56c91ee945467c8d7444c0dc5eb3f0b490f7852034e0e +size 240451 diff --git a/app/src/content/assets/image/llm_logprob.png b/app/src/content/assets/image/llm_logprob.png new file mode 100644 index 0000000000000000000000000000000000000000..0e5680d14895261958bf83ae513f93cd19e146d5 --- /dev/null +++ b/app/src/content/assets/image/llm_logprob.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:37641bcc109eb81115fb29b6849238a69402a8b56419771b52b4b5da4bfc5dbf +size 226280 diff --git a/app/src/content/assets/image/llm_tk_1.png b/app/src/content/assets/image/llm_tk_1.png new file mode 100644 index 0000000000000000000000000000000000000000..b05de7a2ae8ac873f653222761d1016e20dd937a --- /dev/null +++ b/app/src/content/assets/image/llm_tk_1.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c8f485a24f275440b10c21ac2e0a1a8b3f4c6652ee7b008d5737ee4bf82f7e3a +size 150289 diff --git a/app/src/content/assets/image/lm_eval_diff.png b/app/src/content/assets/image/lm_eval_diff.png new file mode 100644 index 0000000000000000000000000000000000000000..9883b558f07721fd3c91711825e5f297442d090a --- /dev/null +++ b/app/src/content/assets/image/lm_eval_diff.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6e62f744dd23a6856c86b62aa36150061cc5c7a4ed85a65811e0ba757277f6de +size 174137 diff --git a/app/src/content/assets/image/maintain-the-unmaintainable.png b/app/src/content/assets/image/maintain-the-unmaintainable.png new file mode 100644 index 0000000000000000000000000000000000000000..522bb22668a65b20de25815a8dedd60176de8c18 --- /dev/null +++ b/app/src/content/assets/image/maintain-the-unmaintainable.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:6b265d8ee4ca1413cd3af59cbf307531e0d11af8402070d54757f15ea2032cdf +size 1168846 diff --git a/app/src/content/assets/image/mmlu_prompt.png b/app/src/content/assets/image/mmlu_prompt.png new file mode 100644 index 0000000000000000000000000000000000000000..4a247cfdef59f5c5dc7b118b65c90ffe479e287a --- /dev/null +++ b/app/src/content/assets/image/mmlu_prompt.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a5563b9b68413b080ed2d221f359884bc1c1e1cbc3b95aaa7f21a4c508b9fd14 +size 67632 diff --git a/app/src/content/assets/image/placeholder-wide.png b/app/src/content/assets/image/placeholder-wide.png new file mode 100644 index 0000000000000000000000000000000000000000..0010420a4ede3b75de2154c4e1fd864a600a76f8 --- /dev/null +++ b/app/src/content/assets/image/placeholder-wide.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:200daf8ade0c7f035d883fefa9a12d6ba7cca504b1d5571774748c3c90639103 +size 34642 diff --git a/app/src/content/assets/image/placeholder.png b/app/src/content/assets/image/placeholder.png new file mode 100644 index 0000000000000000000000000000000000000000..1248e48424b9a83c2230f63e5db5344c9821eb39 --- /dev/null +++ b/app/src/content/assets/image/placeholder.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:af82403ec775e8ed0139a70d034e000cb567b6766a3187d246800f0384d57ea9 +size 677135 diff --git a/app/src/content/assets/image/smoll-training-guide.png b/app/src/content/assets/image/smoll-training-guide.png new file mode 100644 index 0000000000000000000000000000000000000000..19acd9b7f673d9b0dcd1163945943f3b5ef5f042 --- /dev/null +++ b/app/src/content/assets/image/smoll-training-guide.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8a41f479bd3b922ddd723ccf80b7bebf3e8f875ee1049794d472c67e23f0cc12 +size 162677 diff --git a/app/src/content/assets/image/sympy_doc.png b/app/src/content/assets/image/sympy_doc.png new file mode 100644 index 
0000000000000000000000000000000000000000..3827da5c8d75b758b8449c6e67da01b74f950dd1 --- /dev/null +++ b/app/src/content/assets/image/sympy_doc.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4057d0a0dbf4edd640bd7bc1e9000580dcd1da00f1ed27f2b82132070fa243df +size 43043 diff --git a/app/src/content/assets/image/visual-vocabulary-poster.png b/app/src/content/assets/image/visual-vocabulary-poster.png new file mode 100644 index 0000000000000000000000000000000000000000..418527bd50ea464437b59626ad2f8e86dd8ce78a --- /dev/null +++ b/app/src/content/assets/image/visual-vocabulary-poster.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:62f72a7eeabc611d4b312c882589bae9369d49e39dd40e2d17e68c77399efc11 +size 915038 diff --git a/app/src/content/assets/sprites/font-sprite.svg b/app/src/content/assets/sprites/font-sprite.svg new file mode 100644 index 0000000000000000000000000000000000000000..9226d661bbdf56751872b7fb0efc7b807296df58 --- /dev/null +++ b/app/src/content/assets/sprites/font-sprite.svg @@ -0,0 +1,884 @@ +[SVG sprite markup not recoverable]
\ No newline at end of file diff --git a/app/src/content/bibliography.bib b/app/src/content/bibliography.bib new file mode 100644 index 0000000000000000000000000000000000000000..a44925e782e396dca9ff52f9c89acd4b2ee33c8f --- /dev/null +++ b/app/src/content/bibliography.bib @@ -0,0 +1,130 @@ +@inproceedings{vaswani2017attention, + title = {Attention Is All You Need}, + author = {Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N and Kaiser, { + }Lukasz and Polosukhin, Illia}, + booktitle = {Advances in Neural Information Processing Systems}, + year = {2017} +} + +@book{mckinney2017python, + title = {Python for Data Analysis}, + author = {McKinney, Wes}, + publisher = {O'Reilly Media}, + address = {Sebastopol, CA}, + year = {2017}, + edition = {2}, + isbn = {978-1491957660} +} + +@inproceedings{he2016resnet, + title = {Deep Residual Learning for Image Recognition}, + author = {He, Kaiming and Zhang, Xiangyu and Ren, Shaoqing and Sun, Jian}, + booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}, + pages = {770--778}, + year = {2016}, + doi =
{10.1109/CVPR.2016.90}, + url = {https://doi.org/10.1109/CVPR.2016.90} +} + +@article{silver2017mastering, + title = {Mastering the game of Go without human knowledge}, + author = {Silver, David and Schrittwieser, Julian and Simonyan, Karen and Antonoglou, Ioannis and Huang, Aja and others}, + journal = {Nature}, + volume = {550}, + number = {7676}, + pages = {354--359}, + year = {2017}, + month = {oct}, + doi = {10.1038/nature24270}, + url = {https://www.nature.com/articles/nature24270} +} + +@techreport{openai2023gpt4, + title = {GPT-4 Technical Report}, + author = {{OpenAI}}, + institution = {OpenAI}, + year = {2023}, + number = {arXiv:2303.08774}, + archiveprefix = {arXiv}, + eprint = {2303.08774}, + primaryclass = {cs.CL}, + url = {https://arxiv.org/abs/2303.08774} +} + +@phdthesis{doe2020thesis, + title = {Learning Efficient Representations for Large-Scale Visual Recognition}, + author = {Doe, Jane}, + school = {Massachusetts Institute of Technology}, + address = {Cambridge, MA}, + year = {2020}, + doi = {10.5555/mit-2020-xyz} +} + +@incollection{cover2006entropy, + title = {Entropy, Relative Entropy, and Mutual Information}, + author = {Cover, Thomas M. 
and Thomas, Joy A.}, + booktitle = {Elements of Information Theory}, + publisher = {Wiley}, + address = {Hoboken, NJ}, + edition = {2}, + year = {2006}, + pages = {13--55}, + isbn = {978-0471241959} +} + +@misc{zenodo2021dataset, + title = {ImageNet-21K Subset (Version 2.0)}, + author = {Smith, John and Lee, Alice and Kumar, Ravi}, + year = {2021}, + howpublished = {Dataset on Zenodo}, + doi = {10.5281/zenodo.1234567}, + url = {https://doi.org/10.5281/zenodo.1234567}, + note = {Accessed 2025-09-01} +} + +@misc{sklearn2024, + title = {scikit-learn: Machine Learning in Python (Version 1.4)}, + author = {Pedregosa, Fabian and Varoquaux, Ga{"e}l and Gramfort, Alexandre and others}, + year = {2024}, + howpublished = {Software}, + doi = {10.5281/zenodo.592264}, + url = {https://scikit-learn.org} +} + +@inproceedings{smith2024privacy, + title = {Privacy-Preserving Training with Low-Precision Secure Aggregation}, + author = {Smith, Emily and Zhang, Wei and Rossi, Marco and Patel, Neha}, + booktitle = {Proceedings of the 41st International Conference on Machine Learning}, + editor = {Smith, A. and Johnson, B.}, + series = {Proceedings of Machine Learning Research}, + volume = {235}, + pages = {12345--12367}, + address = {Vienna, Austria}, + publisher = {PMLR}, + month = {jul}, + year = {2024}, + url = {https://proceedings.mlr.press/v235/} +} + +@article{kingma2015adam, + title = {Adam: A Method for Stochastic Optimization}, + author = {Kingma, Diederik P. 
and Ba, Jimmy}, + journal = {International Conference on Learning Representations (ICLR)}, + year = {2015}, + archiveprefix = {arXiv}, + eprint = {1412.6980}, + primaryclass = {cs.LG}, + url = {https://arxiv.org/abs/1412.6980} +} + +@misc{raffel2020t5, + title = {Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer}, + author = {Raffel, Colin and Shazeer, Noam and Roberts, Adam and Lee, Katherine and Narang, Sharan and others}, + year = {2020}, + howpublished = {arXiv preprint}, + archiveprefix = {arXiv}, + eprint = {1910.10683}, + primaryclass = {cs.LG}, + doi = {10.48550/arXiv.1910.10683}, + url = {https://arxiv.org/abs/1910.10683} +} diff --git a/app/src/content/chapters/automated-benchmarks/designing-your-automatic-evaluation.mdx b/app/src/content/chapters/automated-benchmarks/designing-your-automatic-evaluation.mdx new file mode 100644 index 0000000000000000000000000000000000000000..9a50ee6bd741d78e56e7f0a868ab3fa6d667cd4a --- /dev/null +++ b/app/src/content/chapters/automated-benchmarks/designing-your-automatic-evaluation.mdx @@ -0,0 +1,572 @@ +--- +title: "Designing your automatic evaluation" +--- + +import Note from "../../../components/Note.astro"; +import Sidenote from "../../../components/Sidenote.astro"; +import HtmlEmbed from "../../../components/HtmlEmbed.astro"; +import Image from "../../../components/Image.astro"; +import UsingHumanAnnotators from "../human-evaluation/using-human-annotators.mdx"; +import envImage from '../../assets/image/env.png'; +import Wide from "../../../components/Wide.astro"; + +### Dataset + +#### Using existing data +You can use existing datasets as they are, and change the associated prompting or metrics (as has been done for older evaluations to adapt them to new prompting methods), but you can also aggregate datasets. + +Dataset aggregation is a good approach when you want to evaluate a specific capability that isn't well-covered by a single benchmark.
Rather than starting from scratch, you can combine samples from multiple existing datasets to create a targeted evaluation suite. This is, for example, what the authors of the "Measuring AGI" paper did recently to try to create a new "AGI evaluation" dataset. + +When aggregating datasets, pay attention to whether +- they contain redundant data (most mathematics datasets are rewrites or aggregations of the same initial problems) +- you need balanced representation across sources (you might not want one dataset to dominate and skew your evaluation) - this will also determine whether to aggregate scores across all samples or per subset +- formats and difficulty levels are compatible (typically, if creating a unified dataset, beware of mixing samples that require sampling with samples that do not). + +Examples: MMLU, Big-Bench (hundreds of diverse tasks), and HELM (combines multiple existing benchmarks for holistic evaluation) + + + +#### Creating a dataset synthetically +**Using rule-based techniques** + +If your task allows, using procedurally generated benchmarks is a very good way to get a virtually infinite supply of samples and avoid contamination! They can generate unlimited fresh test cases algorithmically, while controlling difficulty and enabling automatic verification, ensuring models haven't seen examples during training. + +For some examples, you can look at [NPHardEval](https://arxiv.org/abs/2312.14890), [DyVal](https://arxiv.org/abs/2309.17167), [MuSR](https://arxiv.org/abs/2310.16049), [BabiQA](https://arxiv.org/abs/1502.05698), [ZebraLogic](https://arxiv.org/pdf/2502.01100), IFEval, or GSMTemplate among others. **NPHardEval** generates complexity-grounded tasks like graph problems with automatic verification and monthly refreshes to reduce overfitting. **MuSR** creates complex reasoning instances like 1000-word murder mysteries using neurosymbolic generation.
**ZebraLogic** algorithmically produces logic grid puzzles by generating solutions and iteratively minimizing clues using SAT solvers. **BabiQA** simulates entities following successions of actions. **IFEval** tests instruction-following with 500+ prompts containing verifiable constraints like word counts that can be checked programmatically. **GSM-Symbolic** uses templates to generate diverse math questions. + +Tasks that fit this paradigm usually test mathematical, logical, or coding abilities. + +**Creating synthetic data with models** + +If you want to create synthetic data, you usually start from a number of seed documents that will act as your ground truth. These can be internal and specific to your use cases, or available on the web and of high quality (like Wikipedia, Stack Overflow, ...). You'll then likely need to chunk your data into units of self-contained meaning. + +You'll then likely want a model to design questions from your data. For this, you will need to select a frontier model, and design a very good prompt asking the model to create use-case relevant questions from the provided data. It's better if you ask the model to provide the source on which it based its question. + +You can also give seed prompts as examples to an external model and have it write the prompt that your question-generating model will use, if you want to go full synthetic ^^ + +Once this is done, you can do an automatic validation by using a model from a different model family as a judge on your ground truth + questions + answers. + + +No matter how tempting it is to do everything automatically, you should always check your data at every step, to make sure your evaluations are of high quality. Evaluation is the name of the game and you need to use extremely good data. + + +#### Managing contamination +In general, you should assume that a dataset publicly available on the internet is or will be contaminated.
+ +Solutions to mitigate this include: +- providing a **canary string** in the evaluation set (like in [BigBench](https://github.com/google/BIG-bench)): it is a specific character combination that model creators can look for in their training sets, and whose presence would indicate that the training set contains an evaluation +- providing evaluation sets in **[encrypted](https://arxiv.org/abs/2309.16575) or [gated](https://huggingface.co/datasets/Idavidrein/gpqa)** forms so that they can't be parsed easily by web crawlers, and therefore don't end up accidentally in training sets +- running [dynamic benchmarks](https://arxiv.org/abs/2104.14337): benchmarks regularly updated through time so that models can't "learn the answers by heart" (but it makes datasets more costly) +- if you are running a benchmark, trying to [detect contamination](https://arxiv.org/abs/2311.06233) post-hoc (for example, by looking at the generation perplexity or designing adversarial versions of the prompts - however, no contamination detection method is foolproof) + +However, a contaminated dataset can still be interesting and provide signal during training, as we saw in the ablations section. + + +A model which can only predict well on its training data (and has not latently learnt more high-level general patterns) is said to be **overfitting**. In less extreme cases, you still want to test if your model is able to generalize to data patterns which were not in the training set's distribution (for example, classify toxicity on Stack Overflow after having seen only toxicity on Reddit). + + + +### Choosing a prompt +The prompt is going to define how much information is given to your model about the task, and how this information is presented to the model.
It usually contains the following parts: an optional **task prompt** which introduces the task and the format that the output should follow, **attached context** if needed (for example a source, an image), a **problem prompt** which is what you ask of the model, and, optionally, the answer options for multiple-choice evaluations. + +When defining your prompt, you need to be aware that even small changes in semantically equivalent prompts can make the results vary by quite a lot, and prompt formats might advantage or disadvantage specific models (see [this section](https://huggingface.co/spaces/OpenEvals/evaluation-guidebook#different-prompt)). + +➡️ This can be mitigated by re-running the evaluation several times with prompt variations (but it can be costly), or simply running your evaluation once using a range of prompt formats allocated to different samples of equivalent difficulty. + +➡️ You can also provide examples to your model to help it follow the expected format (using few-shot examples), and adding connector words helps with this overall. + +### Choosing an inference method for your model +You'll need to choose what kind of inference method you need. + + +Using log-probabilities is good for multiple choice question answers (MCQA), to test model knowledge, or ability to disambiguate. +- Pros: + - Makes sure that all models have access to the correct answer + - Provides a proxy for model "confidence" (and calibration) + - Fast to evaluate, especially when we ask the model to predict only one token (A/B/C/D, the indices of the choices, or Yes/No, etc). + - Lets you get signal on small models' task performance +- Cons: + - Slightly over-scores small models which would have generated something outside of the range of available choices if given free rein.
+ - Some models [favor specific choices based on the order in which they have been presented](https://arxiv.org/abs/2309.03882), which could lead to unrepresentative evaluations (unless you're re-running the evaluation n times with shuffled orders, which you should do for significance if you have the budget!) + + + +You can speed up your MCQA predictions by a lot if you make sure your model needs to predict only one token for the task. + +This way, instead of running your `number_of_choices` predictions (`context + choice 1`, `context + choice 2`, etc), you can simply run inference on `context` and compute the probability distribution on the full vocabulary (which will include all your one-token choices) to get your log-probabilities of interest, and do this step in one pass. + + + +Nowadays most evaluations are generative: using generations is very good for any task where you want to test fluency, reasoning, or the ability of your model to actually answer questions. It's also the most relevant way to evaluate reasoning models. + +- Pros: + - Should actually correlate with the LLM's ability to generate fluent text, which is most of the time what people are actually interested in + - The only way to evaluate both closed and open source models +- Cons: + - Can be harder to score (see below) + - More expensive than log-likelihood evaluations, especially if they include sampling or reasoning models + + +### Scoring + +If you are looking at **log-probabilities**, your metrics are going to be easy: you'll likely want to look at a variant of accuracy (how often the most likely choice is the best choice). It's important to normalize the choice scores (by character length, by token length, or with PMI). You could also look at perplexity, recall, or F1 score. + +If you're looking at **generative** evaluations, this is where it gets trickyyy, so the next chapter is specifically on this!
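
The normalized accuracy described in the Scoring paragraph can be sketched as follows. This is a minimal illustration, not the implementation of any particular evaluation library: the function names (`pick_choice`, `accuracy`) and the input layout (one summed log-probability per choice) are assumptions made for the example.

```python
from typing import Optional, Sequence


def pick_choice(
    choice_logprobs: Sequence[float],
    choices: Sequence[str],
    norm: str = "char",
    token_counts: Optional[Sequence[int]] = None,
    unconditional_logprobs: Optional[Sequence[float]] = None,
) -> int:
    """Return the index of the best-scoring choice (hypothetical helper).

    choice_logprobs[i] is the summed log-probability of choice i given the context.
    """
    if norm == "none":
        scores = list(choice_logprobs)
    elif norm == "char":
        # Divide by the number of characters so long answers are not penalized.
        scores = [lp / len(c) for lp, c in zip(choice_logprobs, choices)]
    elif norm == "token":
        if token_counts is None:
            raise ValueError("token_counts is required for token normalization")
        scores = [lp / n for lp, n in zip(choice_logprobs, token_counts)]
    elif norm == "pmi":
        # Subtract the log-probability of the choice without the question context.
        if unconditional_logprobs is None:
            raise ValueError("unconditional_logprobs is required for PMI")
        scores = [lp - ulp for lp, ulp in zip(choice_logprobs, unconditional_logprobs)]
    else:
        raise ValueError(f"unknown normalization: {norm}")
    return max(range(len(scores)), key=scores.__getitem__)


def accuracy(predictions: Sequence[int], golds: Sequence[int]) -> float:
    """Fraction of samples where the predicted index equals the gold index."""
    return sum(p == g for p, g in zip(predictions, golds)) / len(golds)


choices = ["Paris", "the city of Lyon"]
logprobs = [-4.0, -6.0]
print(pick_choice(logprobs, choices, norm="none"))  # 0: the short answer wins on raw score
print(pick_choice(logprobs, choices, norm="char"))  # 1: -6/16 is higher than -4/5
```

The toy numbers show why the normalization choice matters: the same log-probabilities select a different answer depending on whether length is accounted for.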
+ +## Evaluation's main challenge: Scoring free-form text + +Scoring free-form text is tricky because there are typically many different ways to express the same correct answer, making it hard to determine semantic equivalence through simple string matching, and output variations can make two semantically identical answers look completely different. Responses can be partially correct or contain a mix of accurate and inaccurate information. There can even be no single ground truth for the problem at hand, for example for tasks that require judging coherence, helpfulness, and style, which are inherently subjective and context-dependent. + +### Automatically + +When there is a ground truth, however, you can use automatic metrics. Let's see how. + +#### Metrics +Most ways to automatically compare a string of text to a reference are match-based. + +This is more interesting to do on data that was not included in the model training set, because you want to test if it **generalizes** well. You don't want a model which can only predict text it has already "seen"; that would not be very useful! + +The easiest but least flexible match-based metrics are **exact matches** of token sequences. While simple and unambiguous, they provide no partial credit - a prediction that's correct except for one word scores the same as one that's completely wrong. Be aware that "exact match" is used as a catch-all name, and also includes "fuzzy matches" of strings: compared with normalization, on subsets of tokens (prefix only, for example), etc. + +The translation and summarisation fields have introduced automatic metrics which compare similarity through overlap of n-grams in sequences.
**BLEU** (Bilingual Evaluation Understudy) measures n-gram overlap with reference translations and remains widely used despite having a length bias toward shorter translations and correlating poorly with humans at the sentence level (it notably won't work well for predictions which are semantically equivalent but written in a different fashion than the reference). **ROUGE** does a similar thing but focuses more on recall-oriented n-gram overlap. A simpler version of these is the **TER** (translation error rate), number of edits required to go from a prediction to the correct reference (similar to an edit distance). +Lastly, you'll also find model-based metrics using embedding distances for similarity like **BLEURT** (it uses BERT-based learned representations trained on human judgments from WMT, providing better semantic understanding than n-gram methods, but requiring a model download and task-specific fine-tuning for optimal performance). +I'm introducing here the most well known metrics, but all of these metrics have variations and extensions, among which CorpusBLEU, GLEU, MAUVE, METEOR, to cite a few. + + + +Once you have an accuracy score per sample, you can **aggregate** it across your whole set in several ways. In general, people average their results, but you can do more complex things depending on your needs. (Some metrics already come with an aggregation, like CorpusBLEU). + +If your score is **binary**, look at the **precision** (critical when false positives are costly), **recall** (critical when missing positives is costly), **F1 score** (balances precision and recall, good for imbalanced data), or **MCC** (Matthews Correlation Coefficient, which works well with imbalanced datasets by considering all confusion matrix elements). +If your score is **continuous** (less likely though), you can use **mean squared error** (penalizes large errors but heavily weights outliers) or **mean absolute error** (more balanced than MSE). 
If you assume your data should follow a specific linear regression model (for example if you are studying model calibration), you can look at measures like the **R²** or correlation coefficients like **Pearson** (for linear relationships, assumes normality) or **Spearman** (for monotonic relationships without normality assumptions). However, it's a bit out of scope here. + +More generally, when picking your metric and its aggregation, you need to keep in mind what your task is really about. For some domains (ex: medical, chatbots with public interaction), you don't want to measure the average performance, but need a way to evaluate the **worst performance** you'll get (on medical quality of output, on toxicity, etc). + +- This [blog](https://ehudreiter.com/2024/07/10/challenges-in-evaluating-llms/) covers some of the challenges of evaluating LLMs. +- If you're looking for metrics, you'll also find a good list with descriptions, score ranges and use cases in [this organisation](https://huggingface.co/evaluate-metric). + + + + +Automated benchmarks have the following advantages: +- **Consistency and reproducibility**: You can run the same automated benchmark 10 times on the same model and you'll get the same results (barring variations in hardware or inherent model randomness). This means that you can easily create fair rankings of models for a given task. +- **Scale at limited cost**: They are one of the cheapest ways to evaluate models at the moment. +- **Understandability**: Most automated metrics are very understandable. + +However, they are also of **reduced use on more complex tasks**: an automatic metric requires you to have a perfect, unique and unambiguous reference/gold, like for tasks where performance is easy to define and assess (for example, classification of toxicity, knowledge questions with a single answer). More complex capabilities, on the other hand, are harder to decompose into a single and simple answer.
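
The binary aggregation metrics mentioned earlier in this section (precision, recall, F1, MCC) all derive from the same four confusion-matrix counts. The sketch below is only an illustration written from the standard definitions; the function name `binary_metrics` and the 0/1 input encoding are assumptions for the example, and in practice you would likely rely on an existing metrics library.

```python
import math
from typing import Dict, Sequence


def binary_metrics(preds: Sequence[int], golds: Sequence[int]) -> Dict[str, float]:
    """Compute precision, recall, F1 and MCC from 0/1 predictions and 0/1 gold labels."""
    tp = sum(1 for p, g in zip(preds, golds) if p == 1 and g == 1)
    tn = sum(1 for p, g in zip(preds, golds) if p == 0 and g == 0)
    fp = sum(1 for p, g in zip(preds, golds) if p == 1 and g == 0)
    fn = sum(1 for p, g in zip(preds, golds) if p == 0 and g == 1)

    # Each ratio falls back to 0.0 when its denominator is empty.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0

    # MCC uses all four cells of the confusion matrix.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "mcc": mcc}


golds = [1, 1, 1, 0, 0, 0, 0, 0]
preds = [1, 1, 0, 1, 0, 0, 0, 0]
print(binary_metrics(preds, golds))
```

On this toy input there are 2 true positives, 1 false positive, 1 false negative and 4 true negatives, so precision, recall and F1 are all 2/3 while MCC is 7/15.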
+ + +#### Normalization + +Normalization means changing a string of characters to have it fit a specific reference format. For example, when comparing a model prediction to a reference, you usually don't want to penalize extra spacing in the prediction, or added punctuation or capitalisation. That's why you normalize your prediction. + +Normalizations are vital for specific tasks, such as math evaluations, where you want to extract an equation from a longer prediction, and compare it to a reference. +In the table below, we list some issues we saw happening when extracting predictions from model outputs using SymPy naively for the MATH dataset, and how Math-Verify, a specific math parser, solved these. + +| 📄 Example | ❗️Issue | ✅ Math-Verify | 🛑 Naive Approach | +| --- | --- | --- | --- | +| Therefore, the perimeter of one of these triangles is $14 + 7\sqrt{2}$ inches, expressed in simplest radical form. | Failed extraction | `7*sqrt(2) + 14` | None | +| Therefore, the sum of the infinite geometric series is \(\frac{7}{9}\). | Failed extraction | `7/9` | None | +| The final answer is $2x + 4y + z - 19 = 0$. I hope it is correct. | Partial parse of parametric eq | `Eq(2*x + 4*y + z - 19, 0)` | 0 | +| \(23\) | Failed extraction due to latex borders | `23` | None | +| \((- \infty, -14) \cup (-3, \infty)\). | Failed extraction due to interval | `Union(Interval.open(-oo, -14), Interval.open(-3, oo))` | None | +| 100\% | Failed extraction due to invalid symbol | `1` | None | +| 1/3 == 0.333333 | No rounding support | True | False | +| sqrt(1/2)*7 == sqrt(0.5)*7 | No numerical evaluation support | True | False | + + +Look at [this blog](https://huggingface.co/blog/math_verify_leaderboard) for more details! + + +Normalizations can easily [be unfair if not designed well](https://huggingface.co/blog/open-llm-leaderboard-drop), but overall they still help provide signal at the task level.
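
As a rough illustration of the generic text normalization described above (spacing, punctuation, capitalisation), here is a minimal sketch. The helper names `normalize_text` and `exact_match` are made up for this example, and the exact rules are an assumption: real evaluation suites each define their own.

```python
import string


def normalize_text(text: str) -> str:
    """Lowercase, strip ASCII punctuation, and collapse whitespace (illustrative rules only)."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> bool:
    """Exact match computed on the normalized strings."""
    return normalize_text(prediction) == normalize_text(reference)


print(exact_match("  The answer is: Paris! ", "the answer is paris"))  # True
print(normalize_text("1/3"))  # "13": generic rules destroy math answers
```

The second call shows the limit of a generic normalizer: stripping punctuation turns `1/3` into `13`, which is why math tasks need a dedicated parser rather than string cleanup.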
+ +They are also important for the evaluation of predictions generated with chain of thought, or reasoning, as you'll need to remove the reasoning trace (which is not part of the final answer) from the output to get the actual answer. + +#### Sampling + +When models generate outputs, sampling multiple times and aggregating results can provide a more robust signal than a single greedy generation. +This is particularly important for complex reasoning tasks where models may arrive at correct answers through different paths. + +Common sampling-based metrics are: +- **pass@k over n**: Given n generated samples, measures whether at least k pass the test. You'll find two functions for pass@k: computed trivially as $\text{pass}@k = (c \geq k)$, or computed with an unbiased estimator as $\text{pass}@k = 1 - \frac{\binom{n-c}{k}}{\binom{n}{k}}$, where c is the number of correct samples among n total samples. The unbiased estimator gives the probability that at least one of k samples drawn from the n generations is correct. + +- **maj@n** (majority voting): Sample n generations and take the most frequent answer. This helps filter out spurious outputs and works particularly well when the model's correct reasoning path is more consistent than its errors. Commonly used for math and reasoning tasks. +- **cot@n** (chain-of-thought sampling): Sample n reasoning traces and evaluate them. Can be combined with majority voting or a pass@k (sample n reasoning chains, extract final answers, take majority or a threshold). +- **avg@n** (stable average score): Average the scores across n samples. It's a more stable estimator of performance than using the "best" or "most common" case. + + + +When you use sampling evaluations, make sure to always report all sampling parameters (temperature, top-p, k value) as they significantly affect results. + + +- **For training evaluation/ablations**: ❌ Generally avoid sampling metrics as they're expensive and add variance. Stick to greedy decoding with a fixed seed.
+- **For post-training evaluation**: ✅ Sampling metrics can reveal capabilities that greedy decoding misses (especially for more complex tasks requiring reasoning, math or code). +- **At inference**: ✅ These metrics help estimate how much improvement you can get from sampling multiple times at inference. It's particularly cool when you want to study how far you can push small models with test time compute. + +However, keep in mind that sampling k times multiplies your evaluation cost by k. For expensive models or large datasets, this adds up very quickly! + + +#### Functional scorers +Instead of comparing generated text to a reference through fuzzy string matching, functional testing evaluates whether outputs satisfy specific verifiable constraints. This approach is extremely promising because it's more flexible and allows "infinite" updates of the test case through rule-based generation (which reduces overfitting). + +**IFEval and IFBench** are excellent examples of this approach for instruction following evaluation. Rather than asking "does this text match a reference answer?", they ask "does this text satisfy formatting constraints given in the instructions?" + +For instance, instructions might specify: +- *"Include exactly 3 bullet points"* → verify the output contains exactly 3 bullets +- *"Capitalize only the first sentence"* → parse and check capitalization patterns +- *"Use the word 'algorithm' at least twice"* → count word occurrences +- *"Your response must be in JSON format with keys 'answer' and 'reasoning'"* → validate JSON structure + +Each constraint can be checked with a specific rule-based verifier, making these evaluations more unambiguous, interpretable, fast, and considerably less costly than using models as judges. + +This functional approach works particularly well for instruction following, but requires creativity to extend to other text properties. 
The key is identifying aspects of text that can be verified programmatically rather than through semantic comparison. + + +Functional testing is inspired by code evaluation, where functional testing through unit tests is standard practice (checking if generated code produces correct outputs for given inputs). + + +### With humans +Human evaluation is simply asking humans to score predictions. + +Human evaluation is very interesting, because of its **flexibility** (if you define clearly enough what you are evaluating, you can get scores for about anything!), **inherent un-contamination** (if humans write new questions to test your system, they should not be present in your training data, hopefully), and **good correlation with human preference** for obvious reasons. + + +However, when doing evaluation with humans, you need to make sure your annotators are diverse enough that your results generalize. + + +Different approaches exist to evaluate models with humans in the loop. + +**Vibe-checks** is the name given to manual evaluations done by individual members of the community, usually on undisclosed prompts, to get an overall "feeling" of how well models perform on their use cases of preference. (I've also seen the term "canary-testing" used for this, in reference to the high-signal canary-in-a-coalmine approach). Said use cases can be anything from the most exciting to the most mundane - to cite some I've seen on Reddit, they covered legal questions in German, coding, tool use, quality of erotica written, etc. Often shared on forums or social media, they mostly constitute anecdotal evidence, and tend to be highly sensitive to confirmation bias (in other words, people tend to find what they look for). + + + +Using community feedback to establish massive model rankings is what we call an **arena**.
A well known example of this is the [LMSYS chatbot arena](https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard), where community users are asked to chat with models until they find one is better than the other. Votes are then aggregated in an Elo ranking (a ranking of matches) to select which model is "the best". The obvious problem of such an approach is the high subjectivity - it's hard to enforce consistent grading from many community members using broad guidelines, especially since annotators' preferences tend to be [culturally bound](https://arxiv.org/abs/2404.16019v1) (with different people favoring different discussion topics, for example). One can hope that this effect is smoothed over by the sheer scale of the votes, through a "wisdom of the crowd" effect (this effect was found by a statistician named Galton, who observed that individual answers trying to estimate a numerical value, like the weight of a hog, could be modeled as a probability distribution centered around the actual answer). + +The last approach is **systematic annotations**, where you provide extremely specific guidelines to paid, selected annotators, in order to remove as much of the subjectivity bias as possible (this is the approach used by most data annotation companies). However, it can get extremely expensive fast, as you have to keep on doing evaluations in a continuous and non-automatic manner for every new model you want to evaluate, and it can still fall prey to human bias (this [study](https://arxiv.org/abs/2205.00501) showed that people with different identities tend to rate model answer toxicity very differently). + +Vibe-checks are a particularly [good starting point for your own use cases](https://olshansky.substack.com/p/vibe-checks-are-all-you-need), as you'll be testing the model on what's relevant to you.
Pros of casual human evaluations are that they are cheap and, since you leverage users' creativity in a mostly unbounded manner, they allow you to discover fun and interesting edge cases. However, they can be prone to blind spots. For example, there was a debate in the scientific community as to whether LLMs [can draw](https://arxiv.org/abs/2303.12712) unicorns [or not](https://twitter.com/DimitrisPapail/status/1719119242186871275). A year later, it seems like most can! + + +Once you want to scale to more systematic evaluation with paid annotators, you'll find that there are 3 main ways to do so. If **you don't have a dataset**, but want to explore a set of capabilities, you provide humans with a task and scoring guidelines (e.g. *Try to make both of these models output toxic language; a model gets 0 if it was toxic, 1 if it was not.*), and access to one (or several) model(s) that they can interact with, then ask them to provide their scores and reasoning. If **you already have a dataset** (e.g. a set of *prompts that you want your model to never answer*, for example for safety purposes), you prompt your model with them, and provide the prompt, output and scoring guidelines to humans. If **you already have a dataset and scores**, you can ask humans to review your evaluation method by doing [error annotation](https://ehudreiter.com/2022/06/01/error-annotations-to-evaluate/) (*it can also be used as a scoring system in the above category*). It's a very important step when testing a new evaluation system, but it technically falls under evaluating an evaluation, so it's slightly out of scope here. + +Pros of systematic human evaluations, especially with paid annotators, are that you're **getting high quality and private data** adapted to your use case (especially if you rely on in-house annotators), which are mostly **explainable** (scores obtained by the models will be explainable by the humans who gave them).
+However, it's more costly (especially as you'll most likely need rounds of annotations to adapt your guidelines) and does not scale well. + +Overall, however, human evaluation has a number of well known biases, based on first impressions, tone, alignment with annotators' values, etc. (see the figure below). + + + +These biases are not unexpected, but they must be taken into account: not all use cases should rely on using cheap human annotators - any task requiring factuality (such as code writing, evaluation of model knowledge, etc) should include another, more robust, type of evaluation to complete the benchmark (experts, automatic metrics if applicable, etc). + +### With judge models +To mitigate the cost of human annotators, some people have looked into using models or derived artifacts (preferably aligned with human preferences) to evaluate models' outputs. + +This approach is not new, as you can find techniques to measure summarization quality from [model embeddings](https://arxiv.org/abs/1904.09675) in 2019. + +Judge models are simply **neural networks used to evaluate the output of other neural networks**. In most cases, they evaluate text generations. + +Two approaches exist for grading: using [generalist, high capability models](https://arxiv.org/abs/2306.05685v4) or using [small specialist models](https://arxiv.org/pdf/2405.01535) trained specifically to discriminate from preference data (think "spam filter", but for toxicity for example). In the former case, when using an LLM as a judge, you give it a prompt to explain how to score models (ex: `Score the fluency from 0 to 5, 0 being completely un-understandable, ...`). + +Models as judges allow you to score text on complex and nuanced properties.
+For example, an exact match between a prediction and reference can allow you to test if a model predicted the correct fact or number, but assessing more open-ended empirical capabilities (like fluency, poetry quality, or faithfulness to an input) requires more complex evaluators. + +They are used on 3 main tasks: +- *Scoring a model generation*, on a provided scale, to assess a property of the text (fluency, toxicity, coherence, persuasiveness, etc). +- *Pairwise scoring*: comparing a pair of model outputs to pick the best text with respect to a given property +- *Computing the similarity* between a model output and a reference + + In this document, I'll focus on the LLMs + prompt approach for now, but you should definitely check out how classifier judges work, as I think they can be fairly robust and well adapted to a number of use cases, as well as the recently introduced and promising reward-model-as-judge approach (introduced in [this tech report](https://research.nvidia.com/publication/2024-06_nemotron-4-340b), and on which we have a small page [here](https://github.com/huggingface/evaluation-guidebook/blob/main/contents/model-as-a-judge/what-about-reward-models.md)) + +#### Pros and cons of using judge-LLMs +People in favor of judge LLMs have been claiming they provide better: +- **Objectivity** when compared to humans: They automate empirical judgments in an objective and reproducible manner (theoretically - in my opinion, they add more subtle bias than they are worth) +- **Scale and reproducibility**: They are more scalable than human annotators, which allows you to reproduce scoring on large amounts of data (if you control for temperature). +- **Cost**: They are cheap to instantiate, as they don't require training a new model, and can just rely on good prompting and an existing high quality LLM. They are also cheaper than paying actual human annotators (capitalism...).
+ +In my opinion, using LLM judges correctly is extremely tricky, and it's **easy to be deceived for critical use cases**: +- LLMs as judges seem objective, but they have many **hidden biases** that can be harder to detect than the ones in humans, since we're not as actively looking for them (see below). Besides, there are ways to reduce human bias by designing survey questions in specific and statistically robust ways (which has been studied in sociology for about a century), where LLM-prompting is not as robust yet. Using LLMs to evaluate LLMs has been compared to creating an echo-chamber effect, by reinforcing biases subtly. +- They are indeed scalable, but contribute to creating **massive amounts of data** which themselves need to be examined to ensure their quality (for example, you can improve the quality of LLM-judges by asking them to generate a thinking trace, or reasoning around their data, which makes even more new artificial data to analyse) +- They are indeed cheap to instantiate, but are not as good as paying actual expert human annotators for your specific use cases. + + + +This section is therefore a bit long, because you need to be well aware of the limitations of using models as judges: a lot of people are blindly jumping into using them because they seem easier than actually working with humans or designing new metrics, but then end up with uninterpretable data with tricky biases to extract. + +My main personal gripe with using models as judges is that they introduce very subtle and un-interpretable bias in the answer selection. I feel that, much like when crossbreeding too much in genetics studies, you end up with dysfunctional animals or plants, by using LLMs to select and train LLMs, we are just as likely to introduce minute changes that will have bigger repercussions a couple generations down the line.
I believe this type of bias is less likely to occur in smaller and more specialized models as judges (such as toxicity classifiers), but this remains to be rigorously tested and proven. + + +If you want to give it a go, I suggest first reading this [very good guide](https://huggingface.co/learn/cookbook/en/llm_judge) on how to setup your first LLM as judge! + +You can also try the [distilabel](https://distilabel.argilla.io/latest/) library, which allows you to generate synthetic data and update it using LLMs. They have a nice [tutorial](https://distilabel.argilla.io/latest/sections/pipeline_samples/papers/ultrafeedback/) applying the methodology of the [Ultrafeedback paper](https://arxiv.org/abs/2310.01377) as well as a [tutorial on benchmarking](https://distilabel.argilla.io/latest/sections/pipeline_samples/examples/benchmarking_with_distilabel/) implementing the Arena Hard benchmark. + + +#### Getting a Judge-Model + +When using an existing LLM, you can go for [generalist, high capability models](https://arxiv.org/abs/2306.05685v4), [small specialist models](https://arxiv.org/abs/2405.01535) trained specifically to discriminate from preference data, or training your own. + +**Using a generalist LLM** + +With the introduction of more capable LLMs (such as ChatGPT), some researchers started exploring using big models as judges. + + + +**Closed source models (Claude, GPT-o) tradeoffs:** + +Disadvantages: +- **Non-reproducible**: Models can change without notice via API updates +- **Black box**: Un-interpretable decision-making +- **Privacy risks**: Data sent to third parties, potential leakage + +Advantages: +- Easy access without local setup or hardware requirements + +**Open source models are closing the gap** while solving reproducibility and interpretability issues. Models like DeepSeek R1, gpt-oss, and the recent Qwen models are now competitive alternatives. 
+ + + +You'll find a good cost analysis of model providers [here](https://huggingface.co/spaces/ArtificialAnalysis/LLM-Performance-Leaderboard) if you need help picking one. + +**Using a tiny specialized LLM judge model** + +You can also make the choice to use tiny specialized LLM judges. Often with a couple billion parameters, they can run locally on most recent consumer hardware, while being trained from scratch or fine-tuned using instruction data. You often need to follow their specific prompt formats. + +Some existing models as of 2024 were Flow-Judge-v0.1 ([weights](https://huggingface.co/collections/flowaicom/flow-judge-v01-66e6af5fc3b3a128bde07dec)), 3.8B parameters, a Phi-3.5-mini-instruct fine-tuned on a synthetic preference dataset; Prometheus ([weights](https://huggingface.co/prometheus-eval/prometheus-13b-v1.0), [paper](https://arxiv.org/abs/2310.08491)), 13B parameters, a model trained from scratch on a synthetic preference dataset; and JudgeLM ([paper](https://arxiv.org/abs/2310.17631)), 7B to 33B parameters, models trained from scratch on synthetic preference datasets generated with a variety of models. Newer alternatives surely exist! + +**Training your own** +You can also make the choice to train or fine-tune your own LLM-as-judge. (I would avoid doing this, unless you are in a very niche domain). + +If you go in that direction, you'll first need to gather preference data for your task of interest, which can come: +- From existing [human preference datasets](https://www.kaggle.com/competitions/lmsys-chatbot-arena) +- From model generated preference data (which you can generate following the above tiny-model judges papers data sections, or get directly, for example from the Prometheus [preference](https://huggingface.co/datasets/prometheus-eval/Preference-Collection) and [feedback](https://huggingface.co/datasets/prometheus-eval/Feedback-Collection) collections).
+ +Then you need to decide whether to start from a small model to train from scratch, or from an existing model, that you can distill into a new smaller model, or quantize, then fine-tune (using peft or adapter weights if the model is big and your training compute low) using the above data. + + Apparently [starting from a reward model works better than from an instruct model](https://x.com/dk21/status/1826292289930674590) + +#### Designing your evaluation prompt + +Once you've selected your model, you need to define what is the best possible prompt for your task. + + +Provide a clear description of the task at hand: +- *Your task is to do X.* +- *You will be provided with Y.* + +Provide clear instructions on the evaluation criteria, including a detailed scoring system if needed: +- *You should evaluate property Z on a scale of 1 - 5, where 1 means ...* +- *You should evaluate if property Z is present in the sample Y. Property Z is present if ...* + +Provide some additional "reasoning" evaluation steps: +- *To judge this task, you must first make sure to read sample Y carefully to identify ..., then ...* + +Specify the desired output format (adding fields will help consistency) +- *Your answer should be provided in JSON, with the following format \{"Score": Your score, "Reasoning": The reasoning which led you to this score\}* + + +You can and should take inspiration from [MixEval](https://github.com/huggingface/lighteval/blob/main/src/lighteval/tasks/extended/mix_eval/judge_prompts.pyy) or [MTBench](https://github.com/huggingface/lighteval/blob/main/src/lighteval/tasks/extended/mt_bench/judge_prompt_templates.py) prompt templates. + + +Pairwise comparison [correlates better with human preference](https://arxiv.org/abs/2403.16950) than scoring, and is more robust generally. 
+ +If you really want a score, use an integer scale and make sure you provide a detailed explanation for what [each score represents](https://x.com/seungonekim/status/1749289437165769177), or use an additive prompt (*provide 1 point for this characteristic of the answer, 1 additional point if ...* etc). + +Using one prompt per capability to score tends to give better and more robust results. + + +You can also improve accuracy using the following, possibly more costly, techniques: +- **Few shot examples**: like in many other tasks, providing examples can help the judge's reasoning. However, this adds to your context length. +- **Reference**: you can also enhance your prompt with a reference if present, which increases accuracy +- **CoT**: [improves accuracy for older gen models](https://arxiv.org/abs/2212.08073), if you ask the model to output its chain of thought **before** the score (also observed [here](https://x.com/seungonekim/status/1749289437165769177)) +- **Multiturn analysis**: can improve [factual error detection](https://arxiv.org/abs/2305.13281) +- Using **a jury** (many judges, where you pick an aggregate of the answers): [gives better results](https://arxiv.org/abs/2404.18796) than using a single model. It can be made considerably less costly by leveraging many smaller models instead of one big expensive model. You can also experiment with using one model with variations on temperature +- Surprisingly, the community has found that adding stakes to the prompts (`answer correctly and you'll get a kitten`) can increase correctness. Your mileage may vary on this one, adapt to your needs. + +If you are working on critical tasks (medical domain for example), make sure to use methodologies transferred from the humanities, and 1) compute inter-annotator agreement metrics to make sure your evaluators are as unbiased as possible, 2) use proper survey design methodology when creating your scoring grid to mitigate bias.
However, most people don't really want a reproducible and high quality unbiased eval, and will be happy with quick and dirty evaluation through OK-ish prompts. (Which is an OK situation to be in! It just depends on the consequences attached). + +#### Evaluating your evaluator + +Before using a judge-LLM in production or at scale, you want to evaluate its quality for your task, to make sure its scores are actually relevant and useful for you. + + +This will be easier to do if it predicts binary outputs, because you'll be able to use interpretable classification metrics (accuracy/recall/precision). If it predicts scores on a scale, it will be much harder to estimate the quality of the correlation with a reference. Models are notoriously bad at predicting on a scale. + + +So, once you have selected your model judge and its prompt, you'll need to do the following. + +1. **Pick your baseline** +You'll need to compare your evaluator judgments to a baseline: it can be human annotations, the output of another judge model that you know to be of good quality on your task, a gold truth, the same judge with another prompt, etc. + + + +You don't need many baseline examples (50 can suffice), but they must be: +- **Representative**: Cover the full range of your task +- **Discriminative**: Include edge cases and challenging examples +- **High quality**: Use the best reference data you can obtain + + + +2. **Pick your metric** +Your metric will be used to compare your judge's evaluations with your reference. + +In general, this comparison is considerably easier to do if your model is predicting binary classes or doing pairwise comparison, as you'll be able to compute accuracy (for pairwise comparison), or precision and recall (for binary classes), which are all very easy to interpret metrics. + +Comparing the correlation of scores with human or model scoring will be harder to do.
To understand why in more detail, I advise you to read this cool [blog section on the topic](https://eugeneyan.com/writing/llm-evaluators/#key-considerations-before-adopting-an-llm-evaluator). + +In general, if you're a bit lost about what to pick when (in terms of models, metrics, ...), you can also look at [this interesting graph](https://eugeneyan.com/assets/llm-eval-tree.jpg) from [the same above blog](https://eugeneyan.com/writing/llm-evaluators/) ⭐. + +3. **Evaluate your evaluator** +For this step, you simply need to use your model and its prompt to evaluate your test samples! Then, once you get the evaluations, use your above metric and reference to compute a score for your evaluations. + +You need to decide what your threshold for acceptance is. Depending on how hard your task is, you can aim for 80% to 95% accuracy, if you're doing pairwise comparison. Regarding correlations (if you're using scores), people in the literature tend to be happy with 0.8 Pearson correlation with a reference. However, I've seen some papers declare that 0.3 indicates a good correlation with human annotators (^^") so ymmv. + +#### Tips and tricks + + +We discussed in this section's [intro](http://localhost:4321/#pros-and-cons-of-using-judge-llms) a number of LLM judge biases. Let's see how you should try to mitigate them. + +**Lack of internal consistency**: +➡️ You can mitigate this by doing self-consistency prompting of your judge, prompting it multiple times and keeping the majority output + +**Self-preference**: +➡️ You can mitigate this by using a jury + +**Blindness to input perturbation**: +➡️ You can mitigate this by asking the model to explain its reasoning [before providing a score](https://twitter.com/seungonekim/status/1749289437165769177) +➡️ or providing a coherent grading scale in the prompt.
+ +**Position-bias**: +➡️ switching answer positions randomly +➡️ computing the log-probabilities of all possible choices to get a normalized answer + +**Verbosity-bias** (or length-bias): +➡️ You can mitigate this by [accounting for the answer difference in length](https://arxiv.org/abs/2404.04475) + +**Format bias**: +➡️ You can mitigate this by paying attention to the training prompt format (if the model was instruction tuned) and ensuring you follow it. + + +**Picking correct tasks for an LLM judge** + +LLM evaluators: +- are **bad at identifying hallucinations** in general, particularly what are called partial hallucinations (which look close to the ground truth but are actually slightly different) (see [this](https://arxiv.org/abs/2305.11747) and [this](https://arxiv.org/abs/2303.08896)) +- have a low to OK-ish correlation with human annotators on [summarization](https://arxiv.org/abs/2304.02554) ([here too](https://arxiv.org/abs/2303.16634)), [faithfulness](https://arxiv.org/abs/2307.16877), and are not consistently correlated with human judgement more broadly against [a scope of tasks](https://arxiv.org/abs/2406.18403) + +#### What about Reward Models? + +Reward models learn to predict a score from human annotations for given prompt/completion pairs. The end goal is for them to make predictions aligned with human preference. +Once trained, these models can then be used to improve other models, by acting as a reward function which is a proxy for human judgment. + +The most common type of reward model is the Bradley-Terry model, which outputs a single **pairwise score**, following: + +$$p(\text{completion b is better than completion a}) = \text{sigmoid}(\text{score}_b - \text{score}_a)$$ + +This model is trained using only pairwise comparisons of completions, which are easier to collect than scores, but can only compare several completions for one prompt, and not completions across prompts.
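The Bradley-Terry formula above can be sketched in a few lines. This is only an illustration of the equation, not the implementation of any particular reward model; the function name `bradley_terry_prob` and the example scores are hypothetical.

```python
import math


def bradley_terry_prob(score_a: float, score_b: float) -> float:
    """P(completion b is better than completion a) = sigmoid(score_b - score_a)."""
    return 1.0 / (1.0 + math.exp(-(score_b - score_a)))


# Equal scores give a coin flip.
print(bradley_terry_prob(1.0, 1.0))

# Only the difference between the two scores matters: shifting both
# scores by the same constant leaves the probability unchanged.
print(bradley_terry_prob(0.0, 1.0))
print(bradley_terry_prob(10.0, 11.0))
```

Because the probability depends only on the score difference, the raw scores carry no absolute meaning, which is one way to see why they can be compared within a prompt but not across prompts.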
+ +Other models have expanded on this approach to predict a more nuanced probability that a completion is better than the other one ([example](https://huggingface.co/RLHFlow/pair-preference-model-LLaMA3-8B)). + +This allows them to (theoretically) judge subtle differences between completions, at the cost of not being able to easily save and compare many different scores across prompts for the same test set. In addition, context length and memory limits can become an issue when comparing very long completions. + +Some reward models such as [SteerLM](https://arxiv.org/abs/2311.09528) output **absolute scores**, which can be used to evaluate completions directly without the need for pairwise comparisons. These models can be easier to use for evaluation, but are also harder to collect data for, as absolute scores tend to be less stable than pairwise scores in human preferences. + +More recently, models have been proposed that output both absolute and relative scores, such as [HelpSteer2-Preference](https://arxiv.org/abs/2410.01257) and [ArmoRM](https://arxiv.org/abs/2406.12845). + + + +Given a dataset of prompts, we can generate completions from a language model and ask a reward model to score them. + +For models that give absolute scores, the resulting scores can be averaged to get a reasonable summary score. + +However, in the more common case of relative scores, the average reward can be biased by outliers (a few very good or very bad completions) as different prompts may have inherently different reward scales (some prompts are way harder or easier than others). + + + +For relative scores, don't just average raw rewards: outliers and varying prompt difficulty scales will bias results. Use win rates or win probabilities against a reference instead. + + + +Instead, we can use +- win rates: take a reference set of completions and calculate the percentage of completions from the model that are ranked higher than the reference completions. It is slightly more granular.
+- win probabilities: the mean probability of the completions being better than the reference completions, which can give a more fine-grained and smoothly changing signal. + + + +Reward models are typically: +- **Very fast**: Getting a score is as simple as running a forward pass of a relatively small model once (since we only get a score, and not long text, contrary to judge-LLMs) +- **Deterministic**: The same scores will be reproduced through the same forward pass +- **Unlikely to suffer from positional bias**: As most models take only one completion, they cannot be influenced by the order. For pairwise models, positional bias is often also minimal, as long as the training data was balanced with respect to containing both first and second answers as being the best. +- **Require no prompt engineering**: since the model will simply output a score from one or two completions depending on preference data it's been trained on. + +On the other hand they: +- **Require specific fine-tuning**: This can be a relatively costly step, and although they inherit many capabilities from a base model, they may still perform poorly on tasks that are out of the training distribution. +- **Lose efficiency when used both in reinforcement learning and evaluation** (or when using direct alignment algorithms on datasets that are similar to the training data of the reward model), as the language model may overfit to the reward model's preferences. + + + + +- A good place to find high performing models is the [RewardBench Leaderboard](https://huggingface.co/spaces/allenai/reward-bench). +- You can look at how reward models have been used in the [Nemotron](https://arxiv.org/abs/2406.11704) paper. +- For reward models that rate single prompts and completions, you can cache the scores of many reference models and easily see how a new model performs. +- Tracking of win rates or probabilities over training, e.g.
as in [this](https://arxiv.org/abs/2410.11677v1) paper, can allow you to detect model degradation and select optimal checkpoints. + + +### Constraining model outputs +In a number of cases, we might want the model to output a prediction which follows a very specific format to simplify evaluation. + +#### Using a prompt +The easiest way to do this is to add a task prompt which contains very specific instructions as to how the model should answer (`Provide numerical answers in digits.`,`Use no abbreviation.`, etc). + +It won't necessarily work all the time but should be good enough for high capability models. That's the approach we followed in the [GAIA](https://huggingface.co/papers/2311.12983) paper for example. + +#### Few shots and in context learning +The next way to do so is to constrain the model through what is called "in context learning". By providing examples in the prompt (what is called `few-shot prompting`), the model is implicitly biased towards following the repeated prompt shape for the actual sample. + + +It's a method which was overall working quite well until end of 2023! + +However, the widespread adoption of instruction-tuning methods and the addition of instruction data in later stages of model pre-training (continuous pre-training) has biased more recent models towards specific output formats (what is being called [here](https://arxiv.org/abs/2407.07890) *Training on the test task*, and what I would call *overfitting the prompt format*). Reasoning models are also not playing that well with few shot examples because of the reasoning trace. + +It's also a method which can be limited for older models with smaller context sizes, as some few-shot examples can not fit into the context window. + + +#### Structured text generation +Structured text generation constrains the outputs to follow a given path, defined by a grammar or by regular expressions, for example. The `outlines` library implements this using finite state machines, which is very neat. 
(Other approaches exist, such as using interleaved generation for JSON generation, but the FSM one is my favorite). + +To understand more about what happens when using structured generation, you can check the [blog](https://huggingface.co/blog/evaluation-structured-outputs) we wrote together: structured generation reduces prompt variance in evaluation, and makes results and rankings more stable. You can also check the overall `outlines` [blog](https://blog.dottxt.co/) for interesting implementations and observations linked to structured generation. + +However, some recent [research](https://arxiv.org/abs/2408.02442) seems to show that structured generation can lower model performance on some tasks (like reasoning), by moving the prior too far away from the expected probability distribution. + + +- ⭐ [Understanding how Finite State Machines work when using structured generation](https://blog.dottxt.co/coalescence.html), by Outlines. Super clear guide on how their method works! +- [The outlines method paper](https://arxiv.org/abs/2307.09702), a more academic explanation of the above +- [Interleaved generation](https://github.com/guidance-ai/guidance?tab=readme-ov-file#guidance-acceleration), another method to constrain generations for some specific output formats + + +## The forgotten children of evaluation + +### Statistical validity + +When reporting evaluation results, it's critical to include **confidence intervals** alongside point estimates. + +These confidence intervals can be obtained from the raw scores, using standard deviations over the scores or [bootstrapping](https://en.wikipedia.org/wiki/Bootstrapping_(statistics)). For automatic metrics, this is relatively trivial; for model judges, a [recent paper](https://arxiv.org/pdf/2511.21140) suggested bias correction with estimators. For human-based evaluations, you should report agreement.
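As a minimal sketch of the bootstrapping option, assuming you have one score per evaluated sample (for instance 0/1 for accuracy), you can resample the scores with replacement and read the interval off the percentiles of the resampled means. The function name `bootstrap_ci` and its parameters are illustrative and do not come from any evaluation library.

```python
import random
import statistics


def bootstrap_ci(scores, n_resamples=1000, confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of per-sample scores.

    Returns (mean, lower_bound, upper_bound).
    """
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    n = len(scores)
    means = []
    for _ in range(n_resamples):
        # Draw n scores with replacement and record the mean of this resample
        resample = [scores[rng.randrange(n)] for _ in range(n)]
        means.append(statistics.fmean(resample))
    means.sort()
    alpha = (1.0 - confidence) / 2.0
    lower = means[int(alpha * n_resamples)]
    upper = means[min(n_resamples - 1, int((1.0 - alpha) * n_resamples))]
    return statistics.fmean(scores), lower, upper
```

Calling `bootstrap_ci(scores)` on your per-sample scores gives a point estimate and an interval you can report together. The percentile method is the simplest variant; other bootstrap variants exist and may be preferable for small sample sizes.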
+ +You can also compute these with prompt variations, by asking the same questions in slightly different ways, or re-running on the same samples with different prompt formats. + +### Cost and efficiency + + +When designing and reporting evaluation results, we need to start collectively reporting results against model running costs! A reasoning model which requires 10 minutes of thinking and 10K tokens to answer 10 + 20 (because it decides to make an entire segue on binary vs decimal arithmetic) is considerably less efficient than a smol model answering 30 in a handful of tokens. + +
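To make "results against running costs" concrete, here is a rough sketch, assuming you logged the total number of output tokens for a run and know a price per million tokens. The `EvalRun` fields, the `cost_report` helper and any price you plug in are hypothetical, for illustration only.

```python
from dataclasses import dataclass


@dataclass
class EvalRun:
    name: str
    accuracy: float                # fraction of correct answers, in [0, 1]
    output_tokens: int             # total output tokens over the whole evaluation
    usd_per_million_tokens: float  # hypothetical price, replace with your own


def cost_report(run):
    """Pair a score with its token and monetary cost."""
    cost = run.output_tokens / 1_000_000 * run.usd_per_million_tokens
    return {
        "name": run.name,
        "accuracy": run.accuracy,
        "output_tokens": run.output_tokens,
        "cost_usd": cost,
        # Guard against a zero cost (e.g. a free local model)
        "accuracy_per_usd": run.accuracy / cost if cost > 0 else float("inf"),
    }
```

Reporting the score and the cost side by side makes it easier to see when a small gain in accuracy comes with a much larger increase in tokens.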
        +Environmental impact metrics for model evaluation +
        + +We suggest you report the following: +- **Token consumption**: Report the total number of output tokens used during evaluation. This is particularly important to estimate **efficiency**, and it will affect the cost of model as judge evaluations. Token counts directly impact monetary costs and help others estimate the computational requirements. **Monetary cost** can also be a good proxy for efficiency. + + These cost metrics can also be critical when comparing evaluation methods. For instance, while using a powerful LLM as a judge might provide better signal than automatic metrics, the 100x cost increase may not be justified for all use cases. Similarly, sampling-based metrics (pass@k, maj@n) multiply costs with the number of samples, which should be weighed against the improved signal they provide. + +- **Time**: Document the inference time required by the model to complete the evaluation. This includes both the actual inference time and any overhead from API rate limits. This is particularly important for any time-sensitive applications (like some agentic tool use, as in GAIA2). + +Last but not least, reporting the environmental footprint of the models you are running is becoming increasingly important with the overall state of resources available on earth. This includes carbon emissions from training and energy consumption at inference, and these will depend on the model size, hardware (if you know it) and the tokens generated. 
Some smaller or quantized models reach a very interesting performance to consumption ratio. + + + + diff --git a/app/src/content/chapters/automated-benchmarks/some-evaluation-datasets.mdx b/app/src/content/chapters/automated-benchmarks/some-evaluation-datasets.mdx new file mode 100644 index 0000000000000000000000000000000000000000..48281c8db7c341affb4b86317e6a48f408ac1e52 --- /dev/null +++ b/app/src/content/chapters/automated-benchmarks/some-evaluation-datasets.mdx @@ -0,0 +1,165 @@ +--- +title: "Some evaluation datasets" +--- + +import Note from "../../../components/Note.astro"; + +### Some evaluation datasets + +If the task you are interested in is already well studied, chances are that a dataset exists for it. + +Below are a number of evaluation datasets which were developed in the last few years. + + + +Many datasets listed here may be: +- **Obsolete**: Designed pre-LLM for specific properties (translation, summarization) no longer central to model evaluation +- **Contaminated**: Publicly available for years, likely in training data + +However, contamination doesn't mean these datasets have no signal for your task! + + +### Math specific datasets + +| Evaluation name | Task type | Publication date | Data size | Task data | Task/Paper content | Source | Dataset | Comments | +|----- |------ |- |-- |------------|------------- |--------|-------- |---------- | +| AGIEval (SATMath) | Exam dataset + existing datasets | 2023 | 220 | Math problems from the SAT | Paper is actually a compilation of a bunch of human-centric exams to use as eval data. | [Paper](https://arxiv.org/abs/2304.06364) | [HuggingFace](https://huggingface.co/datasets/hails/agieval-sat-math) | - Careful, this paper also includes datasets from other papers! For math, they use AIME & AMC through the MATH dataset, GRE & GMAT through the AQuA-Rat dataset, and GaoKao
        - Metrics: acc/em/f1 | +| AIME (all) | Olympiad dataset | 1983-now | 15 x 2 per year | Mathematical problems requiring a combination of arithmetic, algebra, counting, geometry, number theory, probability and other secondary school math topics | 2nd exam to choose the US team for the International Math Olympiads | [Blog](https://artofproblemsolving.com/wiki/index.php/American_Invitational_Mathematics_Examination) | [Source](https://artofproblemsolving.com/wiki/index.php/AIME_Problems_and_Solutions) | Answer is systematically an integer between 0 and 999. | +| AIME (22, 23 and 24) | Olympiad dataset | 2024 | 90
        | See AIME (all) | | [Paper](https://artofproblemsolving.com/wiki/index.php/American_Invitational_Mathematics_Examination) | [HuggingFace](https://huggingface.co/datasets/AI-MO/aimo-validation-aime) | Used in the AIMO competition | +| ALGES (SingleEQ) | Online sources compilations | 2015 | 508 | Grade school algebra problems extracted from sources on the web | Paper is about implicitly learning and solving the simple equation behind the problem | [Paper](https://aclanthology.org/Q15-1042/) | [Source](https://gitlab.cs.washington.edu/ALGES/TACL2015/-/blob/master/questions.json?ref_type=heads) | - Web sources: http://math-aids.com, http://k5learning.com, and http://ixl.com
        - Pre-LLM paper - data sources are probably OK | +| ALG514 or AllEq | Online forum | 2014 | 514 | Algebra word problems extracted from a crowdsourced tutoring website, cleaned with turking, and manually verified | Paper is about extracting the equation template from the problem to solve it | [Paper](https://aclanthology.org/P14-1026/) | [Source](https://groups.csail.mit.edu/rbg/code/wordprobs/questions.json) | - Web source: Algebra.com | +| AMC 12 | Olympiad dataset | 2000 - now | 25 per year | Mathematical word problems requiring arithmetic, algebra, counting, geometry, number theory, probability and other secondary school math topics. | 1st exam to select the US team for the International Math Olympiad, used to be the American High School Math exam | [Blog](https://artofproblemsolving.com/wiki/index.php/AMC_12) | [Source](https://artofproblemsolving.com/wiki/index.php/AMC_12_Problems_and_Solutions) | - Problems are designed to be solvable by students without any background in calculus. | +| Ape210K | Exam dataset | 2020 | 210K problems, 56K templates | Chinese elementary school-level math word problems written by math teachers | Solve the problems | [Paper](https://arxiv.org/abs/2009.11506) (withdrawn, but v1 still accessible) | [HuggingFace](https://huggingface.co/datasets/MU-NLPC/Calc-ape210k) | - Some problems are templated, which could be interesting for contamination issues
        - Initial dataset is 900K and was manually filtered
        - Also provides "intermediate equations" (useful to test CoT traces if needed)
        - Dataset is in Chinese
        - Intended to be partially used for training | +| AQUA or AQUA-Rat | Exam dataset + turked dataset | 2017 | 100K | Algebraic word problems constructed from a seed of 34K problems from GMAT and GRE, and extended via turking | Task: Solve the problem | [Paper](https://arxiv.org/abs/1705.04146) | [HuggingFace](https://huggingface.co/datasets/deepmind/aqua_rat) | - Intended to be partially used for training
        - Includes the rationale for the problems
        - Use accuracy, BLEU and perplexity for scoring ... | +| ASDiv-A | Online sources compilation | 2020 | 2.3K | Math word grade-school problems collected from various websites and normalized | Task: Solve the problem | [Paper](https://aclanthology.org/2020.acl-main.92/) | [Github](https://github.com/chaochun/nlu-asdiv-dataset) | - Contains problem type and grade level annotations by a Master’s student annotator
        - Focused on a high lexical diversity
        - Used 28 websites | +| CHAMP | Olympiad dataset | 2024 | 270 | Math word problems extracted from a book of olympic competitions examples, rewritten to make the solutions parsable, and annotated | Introduces a math bench. Problems are extended with hints and labeled with concepts, to allow ablation studies on performance | [Paper](https://arxiv.org/abs/2406.18321) | | - Source: Book "Problem-Solving strategies" (Engel, 2008)
        | +| DeepMind Math | Exam dataset + synthetic dataset | 2019 | 10K? | Synthetic raw math problems in algebra, arithmetic, calculus, comparison, conversions between units, polynomials, probability, etc. | Task: Solve the problem | [Paper](https://arxiv.org/abs/1904.01557) | [HuggingFace](https://huggingface.co/datasets/deepmind/math_dataset) | - Full list of domains in appendix B
        - Paper first section is quite nice
        - Provide generation code to generate more examples
        - Provides additional train set
        - Synthetic procedural dataset (inspired from/to extend? school exams dataset) | +| DocMath-Eval | Annotated financial reports + existing Fin math datasets | 2023 | 3.2K | Combine financial reports and existing datasets, read by annotators to generate (or validate) questions, and provide answers as Python programs, then evaluated with domain expert annotators | Solutions should be presented as Python programs which will be run to test their validity | [Paper](https://arxiv.org/abs/2311.09805) | [Source](https://github.com/yale-nlp/docmath-eval) | - Re-uses TAT-QA, FinQA, MultiHiertt, TAT-HQA
        - Looks quite high quality for financial math data!
        - Provides additional train set | +| Dolphin1878 | Online sources compilations | 2015 | 1.5K | Number math word problems sampled from online sources and re-annotated if needed | Paper is about extracting the equation (DOL) tree from the problem using semantic parsing | [Paper](https://aclanthology.org/D15-1135.pdf) | ? | - Sources: algebra.com and answers.yahoo.com
        | +| Dolphin18K | Online sources compilations | 2016 | 18K | Math word problems semi automatically extracted from online sources | | [Paper](https://aclanthology.org/P16-1084.pdf) | [Kaggle](https://www.kaggle.com/datasets/saurabhshahane/sigmadolphin) | - Sources: Math category of Yahoo answers since 2008
        - Method: manually annotate 6K problems then use a classifier. Train a model to extract the gold
        - I'm not sure the quality is amazing there given the amount of automatic extraction (high) vs manual verif (low) | +| Draw-1K | Online sources compilations | 2016 | 1K | General algebra word problems extracted from online sources | Paper is about evaluating solvers (systems generating equations from mwp), and test template and equation equivalence | [Paper](https://arxiv.org/abs/1609.07197) | [Source](https://www.microsoft.com/en-us/download/details.aspx?id=52628) | - Label each problem with the template it follows, can be useful for contam
        - Source: algebra.com | +| FinQA | Expert annotated financial reports | 2021 | 1.1K | Financial questions linked to tables from earnings reports. Annotators provide a question + step by step process + annotations for each page. | Paper introduces the dataset plus a method made of a retriever which extracts relevant facts first to accommodate short context models, then a process generator. | [Paper](https://arxiv.org/abs/2109.00122) | [HuggingFace](https://huggingface.co/datasets/ibm/finqa) | - Likely high quality: used paid expert annotators + external experts annotators had high agreement on the task
        - Total set is 8.2K
        - Unsure from the paper how the tables are formatted - markdown maybe?
        - Data source: earnings reports of S&P 500 companies (1999-2019) | +| FrontierMath | Expert created dataset | 2024 | 100+ (precise number unknown) | Entirely novel math problems created for the paper across most math domains. Solutions are either integer or SymPy objects to be automatically verifiable through unit-test like Python programs. Problems are labelled. | Introduces the dataset | [Paper](https://arxiv.org/abs/2411.04872) | Private | - Nice discussion of contamination in the paper
        - Experts: 60 mathematicians over 12 countries
        - All problems are peer reviewed
        - Probably the highest quality dataset here atm
        - Data is not public; however, since closed source models have been evaluated, it's likely they'll be contaminated for it in future occurrences :( | +| GAOKAO-Bench (MathCloze, MathQA) | Exam dataset | 2023 | ~500 | Math word problems at high school level from the Chinese college entry exams | | [Paper](https://arxiv.org/abs/2305.12474) | [Source](https://github.com/OpenLMLab/GAOKAO-Bench?tab=readme-ov-file),
        Datasets are only 2023
        [HuggingFace](https://huggingface.co/datasets/hails/agieval-gaokao-mathcloze) and [HuggingFace](https://huggingface.co/datasets/hails/agieval-gaokao-mathqa) | - Mathematical formulas are converted to latex
        - Problems are in Chinese
        - Dataset is updated yearly
        - Paper explores a bunch of grading methods, including LLM as judge
        - Paper contains surprisingly little info about the dataset | +| GAOKAO 2023 (MathEn) | Exam dataset, Competition dataset | 2023 | 385 | Math word problems at high school level | Compiles questions from the 2023 Chinese National College Entrance Examination, the 2023 American Mathematics Competitions, and the 2023 American College Testing | | [HuggingFace](https://huggingface.co/datasets/MARIO-Math-Reasoning/Gaokao2023-Math-En) | | +| GSM1K | Manually created dataset in the style of another dataset | 2024 | 1.2K | Diverse "grade school"-like math word problems, following the solving distribution of GSM8K | Paper does a contamination analysis of models on GSM8K vs GSM1K | [Paper](https://arxiv.org/abs/2405.00332) | Private | - Paper also seems to suggest that perplexity analysis is not very good at detecting contamination | +| GSM8K | Manually created dataset in the style of an exam dataset | 2021 | 8.5K | Diverse grade school-level math word problems | Paper is about training verifiers to solve math word problems | [Paper](https://arxiv.org/abs/2110.14168v2) | [Github](https://github.com/openai/grade-school-math)
        [Hugging Face](https://huggingface.co/datasets/gsm8k) | - Best results with an external calculator added
        - All answers are positive integers, 50% of answers are between 0 and 8
        - Annotation used Upwork for 1K problems, then Scale for the next. Problems writers were provided with seed questions from a 175B GPT3 model | +| iGSM (med and hard sets) | Synthetic dataset | 2024 | 20K | Problems are generated using a combination of dependency graphs between objects and categories (with direct, and implicit dependencies) and number of operations to generate new problems | Paper is about studying actual math reasoning on an extension of GSM8K, including probing internal model states | [Paper](https://arxiv.org/pdf/2407.20311) | [HuggingFace](https://huggingface.co/datasets/YangZhoumill/infini_igsm_4k_noise_close) | - Idea is theoretically nice but problems generated are very unrealistic with high numbers of operation
        - Paper focuses on "mental process" of model which I find dubious (though the probing section is nice!)
        - So much anthropomorphism -_- | +| GSMHard | Adaptation of existing dataset, numbers replaced | 2022 | 8.5K | GSM8K with bigger/less common numbers to make the problems harder. However, change was done automatically through programs generated, and only 25 changes were checked (+ 50 cases were done manually). | Paper is about using program-aided LMs (= generating CoT alternating equations and reasoning steps, and computing the solution on the last equation with Python) | [Paper](https://arxiv.org/abs/2211.10435) | [Hugging Face](https://huggingface.co/datasets/reasoning-machines/gsm-hard) | - Described in appendix H1
        - Good idea, but not sure of the quality. | +| GSM-IC | Adaptation of existing dataset with perturbations | 2023 | 58K | 100 samples from GSM8K with irrelevant context added (using a template for the irrelevant sentence, plus roles/numbers fillers) | Paper tests how sensitive LLMs are to irrelevant context when reasoning on math tasks | [Paper](https://arxiv.org/abs/2302.00093) | [HuggingFace](https://huggingface.co/datasets/voidful/GSM-IC) | | +| GSM-Plus | Adaptation of existing dataset with perturbations | 2024 | 10K | GSM8K with 8 variations per question, added by GPT4 and manually annotated by selected humans (cross annotator agreement checked) | Paper introduces the dataset and compares results on several GSM8K variants across models and prompting formats | [Paper](https://aclanthology.org/2024.acl-long.163/) | [HuggingFace](https://huggingface.co/datasets/qintongli/GSM-Plus) | - Changes include: replacing numbers by other numbers, changing the operations, changing the question, adding distractors, etc (nice typology of changes, I feel it could be extended) | +| GSM-Symbolic | Adaptation of existing dataset, templated | 2024 | 8.5K | Templated GSM8K problems, which allows generating new evals at will | Paper creates parsable templates from GSM8K to be able to generate new problems at will and analyse contamination on GSM8K | [Paper](https://arxiv.org/abs/2410.05229) | To be released | - Contains other specific subsets (M1, P1, P2, which are difficulty levels, as well as NoOp, with seemingly relevant but actually irrelevant info added), and some experiments are done with few shot formats
        - Lacking a dataset description table with all subsets imo | +| Hungarian HighSchool National Finals Exam | Exam dataset | 2023 | 33 | Problems from the 2023 Hungarian national high school finals in math | | [Source](https://dload-oktatas.educatio.hu/erettsegi/feladatok_2023tavasz_kozep/k_matang_23maj_fl.pdf) | [HuggingFace](https://huggingface.co/datasets/keirp/hungarian_national_hs_finals_exam) | - Requires grading by hand atm | +| HMWP | Exam dataset | 2020 | 5.4K | Annotated math word problems from a Chinese school-level (K-12) problems bank | Introduces a new formalism to represent MWP equations uniformly | [Paper](https://arxiv.org/abs/2010.06823) | [HuggingFace](https://huggingface.co/datasets/Gxg/HWMP) | - Dataset is in Chinese
        - Sources: Chinese K12 problems | +| Math23K | Online sources compilation | 2017 | 23K | Automatically extracted elementary school level math word problems. | Introduces an RNN to solve MWP | [Paper](https://aclanthology.org/D17-1088/) | [HuggingFace](https://huggingface.co/datasets/Gxg/Math23K) | - Sources: Chinese math word problems from online education websites for elementary school students.
        - Dataset is in Chinese
        - Extraction is rule based, but it's very unclear how much manual validation was done | +| Math401-LLM | Synthetic dataset | 2023 | 401 | Arithmetic expressions combining additions, subtractions, multiplications, exponentiations, logarithms etc | Paper wants to measure strict arithmetic ability of models | [Paper](https://arxiv.org/abs/2304.02015) | [Github](https://github.com/GanjinZero/math401-llm) | - Models are not that good atm for log/trig problems or big numbers | +| MATH | Olympiad datasets | 2021 | 12.5K | Mathematical problems from real competitions in natural language and latex, annotated with difficulty levels. | | [Paper](https://arxiv.org/abs/2103.03874) | [HuggingFace](https://huggingface.co/datasets/lighteval/MATH) | - Sources: AMC 10, AMC12, AIME, "and more"
        - Also introduces a train set created from scraping Khan Academy and AMPS | +| MathOdyssey | | | | | | | | | +| MathQA | Adaptation of existing dataset, annotated | 2019 | 37K | Annotated solvable problems from the AQuA dataset with formal annotation programs (using humans as annotators and testing their agreement) | Aims to introduce a representation language for math problems, applies the method to AQuA | [Paper](https://arxiv.org/abs/1905.13319) | [HuggingFace](https://huggingface.co/datasets/allenai/math_qa) | - Sources: AQuA | +| MAWPS | Existing dataset compilation | 2016 | 3.3K | Math word problems from existing datasets | Framework to create new math problems, notably to remove lexical or template overlap when adding new datasets | [Paper](https://aclanthology.org/N16-1136/) | [Github](https://github.com/sroy9/mawps) | - Sources: ALG514, ALGES, and other pre-LLM datasets | +| MiniF2F | Olympiad dataset | 2022 | 244 | Olympiad math word problems formalized with theorem provers when possible (Lean, Metamath, Isabelle) | Paper is about testing math proof solvers' ability to reason on formal logic | [Paper](https://arxiv.org/abs/2109.00110) | Possibly [HuggingFace](https://huggingface.co/datasets/cat-searcher/minif2f-lean4) | - Sources: AIME, AMC, IMO | +| NPHardEval | Synthetic dataset | 2023 | 900 | Complexity math word problems of varied difficulty level built from synthetic graph/linear data | Paper introduces the benchmark and uses it to evaluate reasoning ability of models. Also explores benchmark robustness! | [Paper](https://arxiv.org/abs/2312.14890) | [Github](https://github.com/casmlab/NPHardEval) | - Problems: sorted array search, edit distance, shortest path, traveling salesman, graph coloring, knapsack problem, meeting scheduling problem
        - Can be regenerated as needed | +| NuminaMATH CoT | Existing dataset compilation | 2024 | 860K | Math word problems (K12 + olympiad levels) combining existing datasets | NA | NA | [HuggingFace](https://huggingface.co/datasets/AI-MO/NuminaMath-CoT) | - Sources: AOPS, AMC, AIME, CN-K12, GSM8K, MATH, ORCA_math, Synthetic AMC and MATH data, and other Olympiads sets
        - Careful if you use this as a train set, as you will be contaminated on all major math benchmarks | +| NuminaMATH TiR | Existing dataset compilation | 2024 | 72K | Subset of NuminaMATH CoT focused on problems solvable with tool integrated reasoning | NA | NA | [HuggingFace](https://huggingface.co/datasets/AI-MO/NuminaMath-TiR) | - Sources: AOPS, AMC, AIME, CN-K12, GSM8K, MATH, ORCA_math, Synthetic AMC and MATH data, and other Olympiads sets
        - Careful if you use this as a train set, as you will be contaminated on all major math benchmarks | +| OmniMath | Olympiad datasets | 2024 | 2.2K | Olympiad math word problems. Problems are extracted from forums or olympiad websites (using rule based + LLM rephrasing), then annotated and verified by humans. | Paper introduces the benchmark and a judge trained to evaluate the answers (since they are free form) | [Paper](https://arxiv.org/abs/2410.07985) | [HuggingFace](https://huggingface.co/datasets/KbsdJames/Omni-MATH) | - Sources: IMO, IMC, AoPS forum and wiki
        - Domain labeling is done with LLMs | +| OlympiadBench | Olympiad datasets | 2024 | 8.4K | Olympiad/math/physics word problems. Answers are automatically evaluated - either numbers or equations (evaluated with SymPy) | | [Paper](https://arxiv.org/pdf/2402.14008) | | - Sources: Global Mathematics and Physics Olympiad Problems, Regional and National Chinese Math Competitions, and Gaokao Mock Questions for Mathematics and Physics
        - Includes a physics subset
        - VLM evaluation! | +| OlympicArena | Olympiad datasets | 2024 | 11K | | | [Paper](https://arxiv.org/pdf/2406.12753) | | | +| PRM800K | Synthetic data | 2023 | 800K | Preference data from annotators on 800K solutions generated by a model | Paper introducing process supervision to improve reward models (compares output and process supervision) | [Paper](https://arxiv.org/abs/2305.20050) | [HuggingFace](https://huggingface.co/datasets/tasksource/PRM800K) | - More a train set than an evaluation | +| SVAMP | Adaptation of existing dataset | 2021 | 1K | One-unknown arithmetic word problems of
        grade level up to 4, created with experts applying variations to ASDiv-A. | Paper wants to assess question sensitivity, reasoning ability, and structure invariance in models for math evaluations. | [Paper](https://aclanthology.org/2021.naacl-main.168/) | [Github](https://github.com/arkilpatel/SVAMP/blob/main/SVAMP.json) | - Variations: same object & different structure, opposite, both different, adding relevant or irrelevant information, changing information, inverting operations, changing order of sentences or objects | +| TabMWP | Online source adaptation | 2022 | 38K | Tabular math word problems requiring multi-hop reasoning, extracted from an online educational website and manually annotated. | Paper wants to test tabular math reasoning, and introduces the dataset | [Paper](https://arxiv.org/abs/2209.14610) | [HuggingFace](https://huggingface.co/datasets/Arietem/tabmwp) | - Source: IXL learning website
        - Tabular data is provided as an image, semi-structured text, and a table
        - Answers are generative or MCQA
        - Dataset is tested against turkers | +| TAL-SCQ5K-En | Competitions dataset | 2023 | 4K | Math word problems in MCQA format, with math expressions as latex | NA | None | [HuggingFace](https://huggingface.co/datasets/math-eval/TAL-SCQ5K) | - Contains English and Chinese
        - Also contains 6K train samples and CoT | +| TemplateGSM | LLM-generated data | 2024 | 7M | GPT4-generated math word problems inspired in shape by GSM8K | Paper uses GPT4 generated meta-template to generate problems by changing parameters. Uses a verifier to ensure usability | [Paper](https://templatemath.github.io/TemplateMath_Part_I.pdf) | [HuggingFace](https://huggingface.co/datasets/math-ai/TemplateGSM) | - Since everything is LLM generated, I would expect stronger proofs of quality | +| TheoremQA | Online sources adaptations | 2023 | 800 | QAs about university level theorems | Protocol: Uses GPT4 to enumerate subfields of relevant domains, then plausible theorems lists, then uses domain experts to actually look for said theorems, then look for QA on the web concerning them | [Paper](https://arxiv.org/abs/2305.12524) | [HuggingFace](https://huggingface.co/datasets/TIGER-Lab/TheoremQA) | | + +### Pre-LLM datasets + +| Evaluation name | Task type | Task data | Task content | Source | Dataset | Comments | +|--- |--- |--- |--- |--- |--- |--- | +| DeepFix | Code task, Code-to-code, Correction | 7K student-written erroneous C programs | Correct the C programs | [Paper](https://ojs.aaai.org/index.php/AAAI/article/view/10742) | | | +| MLSum | Generation, Multilingual, Summarization | 1.5M news summary/article pairs from the DailyMail, Le Monde, Süddeutsche Zeitung, El Pais, Moskovskij Komsomolets and Internet Haber (en, fr, de, es, ru, tur) | Summarize the articles | [Paper](https://arxiv.org/abs/2004.14900) | [Hugging Face](https://huggingface.co/datasets/mlsum) | Palm: Prefixed with a prompt, truncated article to 2048 tokens | +| TransCoder | Code task, Code-to-code | 852 parallel functions in Python/Java/C++ | Translate from a language to another | [Paper](https://arxiv.org/pdf/2006.03511.pdf) | [From paper](https://github.com/facebookresearch/CodeGen/blob/main/docs/transcoder.md) | | +| WMT | Multilingual, Translation | Datasets from the WMT conf on
machine translation - datasets available depend on the year | Translate from a language to another | [Conference](https://www.statmt.org/wmt20/)
        Replace the 2 digits by the conference year | | | +| Adversarial NLI | Language Inference | 10K entailment dataset generated using human in the loop adversarial attacks, looking for predicates which force models to predict wrong entailment labels (uses contexts from StoryCloze, CommonCrawl, Wikipedia, the Open Annotated National Corpus, WikiHow and GLUE) | Predict entailment | [Paper](https://arxiv.org/abs/1910.14599) | [Data](https://dl.fbaipublicfiles.com/anli/anli_v1.0.zip)
        [Github](https://github.com/facebookresearch/anli) | R1 to R3 = rounds of data generation | +| APPS | Text-to-code| 10K Python coding problems in natural languages, scraped from leetcode sites, with a suite of test cases. | Solve the Python problem | [Paper](https://arxiv.org/abs/2105.09938) | [Github](https://github.com/hendrycks/apps)
        [Data](https://people.eecs.berkeley.edu/~hendrycks/APPS.tar.gz) | | +| AQuA | Arithmetic, Reasoning | 100K multiple choice problems (GMAT, GRE, other sources) with question/options/rationale | Select the correct MCQA | [Paper](https://arxiv.org/abs/1705.04146) | [Github](https://github.com/deepmind/AQuA) | Best results obtained with an external calculator added | +| ARC | Common Sense, Reasoning | 8K Grade school science questions: e = easy set, c = challenge set | Select the correct MCQA | [Paper](https://arxiv.org/abs/1803.05457) | [Data](https://allenai.org/data/arc) | Careful, this is the AI2 Reasoning Challenge, not the Abstraction and Reasoning Corpus | +| bAbI | Reasoning | 20 tasks each with 2K automatically generated questions + short scenarios (successive actions generated with a simulated text adventure game). | Reason over the sentence to select the correct conclusion | [Paper](https://arxiv.org/abs/1502.05698) | [Github](https://github.com/facebookarchive/bAbI-tasks)
        [Data](https://research.facebook.com/downloads/babi/) | See Part 4 for the simulation env and its constraints, it’s quite a fun idea. Probably not too hard to reproduce for other types of reasoning. | +| BBQ | Bias detection | 58K examples with two contexts (ambiguous and explicit about a bias), two questions (negative and non-negative) and possible answers, constructed from manual templates and checked with crowdsourcing. | Predict the correct, non biased answer. Difference between accuracies depending on context/question allows building a bias score. | [Paper](https://aclanthology.org/2022.findings-acl.165/) | [Github](https://github.com/nyu-mll/BBQ/tree/main/data) | | +| BLiMP | Language Understanding | 67 datasets, each containing 1K artificially generated minimal pairs testing syntactic, morphological and semantic knowledge, validated with MTurking. | Accuracy is measured by checking whether the model assigns a higher log-probability to the correct sentence of each pair. | [Paper](https://aclanthology.org/2020.tacl-1.25/) | [Github](https://github.com/alexwarstadt/blimp/tree/master/data) | Things tested: anaphor agreement, argument structure, binding, control/raising, determiner-noun agreement, ellipsis, filler-gap, irregular forms, island effects, NPI licensing, quantifiers, subject-verb agreement | +| BOLD | Generation, Toxicity detection | 23K prompts extracted from beginning of Wikipedia sentences containing a race/religion/political/gender/profession group member (ex: woman artist for gender=female).
        | Task: generating end of sentence, and toxicity is evaluated through a range of metrics (sentiment analysis, using classifiers, …). In HELM, toxicity is measured using the Perspective API.
        | [Paper](https://arxiv.org/abs/2101.11718) | [Github](https://github.com/amazon-science/bold/tree/main/prompts) | | +| BooksCorpus | N/A | 11K unpublished books of more than 20K words scraped from the web (of 16 different genres).

        | In the original paper, it's used to train a sentence embedding model - in model papers, it's often used for contamination or perplexity evaluations. | [Paper](https://arxiv.org/pdf/1506.06724.pdf) | [Hugging Face](https://huggingface.co/datasets/bookcorpus) | | +| BooksCorpus_HELM| Generation, Memorisation | 1K randomly sampled books from BooksCorpus. | Task: from a random number of tokens beginning a paragraph, the model must generate a follow up - measure exact and near-exact reproduction. | [Paper](https://arxiv.org/abs/2211.09110) | [Data](https://drive.google.com/file/d/10uC4jM6tgI1pgtq--07FFHQ2Te7-SXGA/view) | | +| BoolQ | Language Inference, Language Understanding | 16K sentences of naturally occurring yes no QA, from question + context from Wikipedia | Answer the MCQA | [Paper](https://arxiv.org/abs/1905.10044) | [Website](https://super.gluebenchmark.com/tasks) | | +| CB | Language Understanding | 1.2K of discourse (news from the Wall Street Journal, fiction from the British National Corpus, dialogue from Switchboard), containing context + target sentence | Predict commitment entailment | [Paper](https://semanticsarchive.net/Archive/Tg3ZGI2M/Marneffe.pdf) | [Website](https://super.gluebenchmark.com/tasks) | | +| Civil comments | Toxicity detection | 1.8M online comments, with crowd sourced human labels for and toxicity following the Perspective API guidelines, and among these, 450K labeled with identity terms (crowdsourced, to pick in a list). | Task : toxicity prediction, labels are used to identify areas of biases in models | [Paper](https://arxiv.org/pdf/1903.04561.pdf) | [Kaggle](https://www.kaggle.com/c/jigsaw-unintended-bias-in-toxicity-classification)
        [Hugging Face](https://huggingface.co/datasets/civil_comments) | Original paper contained a synthetic test set (77K examples generated from templates using 50 identity terms, 50/50 on toxicity) and a human labeled dataset (description in the Task content col) - I suppose the dataset available is the latter | +| Clean E2E NLG | Description, Generation | 50K crowdsourced generated descriptions of restaurants given keys and values (type of food = X, budget = Y, …). | | [Paper](https://arxiv.org/abs/1706.09254) | [Hugging Face](https://huggingface.co/datasets/e2e_nlg_cleaned) | Palm: Prefixed with a prompt, truncated article to 2048 tokens | +| CNN/DailyMail | Cloze/Completion, Summarization | Original dataset: 200K new documents (CNN/DailyMail between 2007 and 2015) converted into Cloze format by removing some of the text’s named entities, and using them as keys.
        | In HELM: Uses above documents (in complete form) as text to summarize, and their highlights as gold summaries. | [Paper](https://arxiv.org/pdf/1506.03340.pdf) | [Hugging Face](https://huggingface.co/datasets/cnn_dailymail)
        [Data](https://cs.nyu.edu/~kcho/DMQA/) | (I suspect this does not produce very good summaries though) | +| CommonsenseQA | Common Sense, Reasoning | 12K turked Q/A (initialized from ConceptNet associations), then filtered by quality, with added context from a Google Search query > Some text likely overlaps with CC data | Answer the MCQA | [Paper](https://aclanthology.org/N19-1421/) | | Best results with an external calculator added | +| Contrast Sets | Generation, Robustness | 10 contrast sets of up to 1K examples, for their datasets (see comments) made by (often the original paper’s) researchers (increase reasoning steps in questions, replace words by their opposites, change counts …). | Follow original task setup with new examples, and see how/if performance drops.
        In HELM: use the IMDb and DROP contrast sets | [Paper](https://aclanthology.org/2020.findings-emnlp.117/) | [Data](https://allenai.org/data/contrast-sets) | NLVR2, IMDb sentiment analysis, MATRES Temporal RE, English UD parsing, PERSPECTRUM, DROP, Quoref, ROPES, BoolQ, MC-TACO.

        Dataset construction is way more detailed in Appendix | +| COPA | Common Sense, Language Understanding | 1K premise + causal questions with alternatives (common sense) | | [Paper](https://people.ict.usc.edu/~gordon/publications/AAAI-SPRING11A.PDF) | [Website](https://super.gluebenchmark.com/tasks) | | +| CoQA | In-Context Reading Comprehension | 127K Conversational QA, from a given context (rationale must be provided too) - written by annotators | | [Paper](https://arxiv.org/abs/1808.07042) | v1.0 from [Data](https://stanfordnlp.github.io/coqa/) | | +| DataImputation | Real life task, Reasoning, Structured data | 8 structured datasets from several sources. | Task: from a row of attributes with gaps, the model must fill the gaps (ex: extrapolating city from phone number, phone brand from its specs).
        | [Paper](https://sxsong.github.io/doc/21icde-imputation.pdf) | [Data restaurant](https://www.cs.utexas.edu/users/ml/riddle/data/restaurant.tar.gz)
        [Data Buy](https://dbs.uni-leipzig.de/file/Abt-Buy.zip) | See table 2 for all sources.
        In HELM: use the subsets Buy and Restaurant, convert input to natural language, test accuracy.| +| Digits arithmetics (2D+, 2D-, 3D+, 3D-, …) | Arithmetic | Basic arithmetic tasks for n digits addition, subtraction, composite operations with 2K example each | Task: solve the math | [Paper](https://arxiv.org/pdf/2005.14165.pdf) | [Github](https://raw.githubusercontent.com/openai/gpt-3/master/data/) | All links come from the lm-evaluation-harness/lm_eval/datasets/arithmetic | +| DROP | Arithmetic, In-Context Reading Comprehension | 55K adversarial questions which require 1) selecting relevant items from the text and 2) computing on them (sorting/counting/…) to get the correct answer | Task: select and count to provide the correct answer| [Paper](https://aclanthology.org/N19-1246/) | [Data](https://allenai.org/data/drop) | | +| Dyck language_HELM | Symbolic manipulation | 500 D_n words between 52 and 100 characters (”word” made of nested brackets/parenthesis) where the last i characters have been removed. | Task: Predict the unique sequence of closing parentheses. | [Paper](https://arxiv.org/abs/2211.09110) | [Github](https://github.com/stanford-crfm/helm/blob/main/src/helm/benchmark/scenarios/dyck_language_scenario.py)
        Also has a different version in BigBench | | +| HellaSwag | Cloze/Completion | 60K adversarially filtered multiple choice Q/A | Choose the correct next sentence (from captions or WikiHow) | [Paper](https://aclanthology.org/P19-1472/) | [Github](https://github.com/rowanz/hellaswag/tree/master/data) | | +| HumanEval | Code task, Text-to-code | 164 hand written programming problems with function signature, docstring, body + unit tests | Aim is to complete function to pass unit tests | [Paper](https://arxiv.org/abs/2107.03374) | [Hugging Face](https://huggingface.co/datasets/openai_humaneval) | | +| IMDB | Sentiment Analysis | 50K reviews from IMDB, with even positive (score ≥ 7) /negative (score ≤ 4) reviews (no neutral). | Classify positive/negative review. | [Paper](https://aclanthology.org/P11-1015/) | [Website](https://ai.stanford.edu/~amaas/data/sentiment/) | | +| LAMBADA | Cloze/Completion | 10K Narrative contexts (from the BookCorpus) followed by a sentence where the last word is masked and must be predicted. Specifically built to force use of the context. | Predict the last word. | [Paper](https://aclanthology.org/P16-1144/) | [Zenodo](https://zenodo.org/record/2630551#.YFJVaWT7S_w) | | +| Language Modeling Evaluation_HELM | Language Modeling | Compilation in HELM of several datasets: WikiText-103, ThePile (particularly arXiv, BooksCorpus2, Enron Emails, PubMed Central, Wikipedia), TwitterAAE, ICE. | Task: get conditional log probability of the full sequence (perplexity measure) | [Paper](https://arxiv.org/abs/2211.09110) | [The pile website](https://pile.eleuther.ai/ )
        [BLiMP Github](https://github.com/alexwarstadt/blimp)
        [WikiText data](https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-103-raw-v1.zip)
        [Twitter AAE data](http://slanglab.cs.umass.edu/TwitterAAE/)
        [ICE data](https://www.ice-corpora.uzh.ch/en/access.htm) | | +| LegalSupport | Entailment, Real life task, Reasoning | 20K legal entailment scenarios, constructed from state/federal legal opinions (assertion is used as context, and 2 supporting sources (”see X, rule”) are selected at random). | Task: finding which rule best supports assertion. | [Paper](https://arxiv.org/abs/2211.09110) | [Data](https://docs.google.com/uc?export=download&id=1PVoyddrCHChMxYrLhsI-zu7Xzs5S8N77) | | +| LinuxKernel_HELM| Generation, Memorisation | 2K randomly sampled functions from the Linux Kernel. | Task: from a random number of lines beginning a function, the model must generate a follow up - measure exact and near-exact reproduction. | [Paper](https://arxiv.org/abs/2211.09110) | [Data](https://drive.google.com/file/d/1Y5piYwil7T6n8toT_-d7NWqVZHh9NVxJ/view) | | +| LSAT | Analytical Reasoning, In-Context Reading Comprehension, Logical Reasoning | 10K questions from the Law School Admission Test (analytical, logical reasoning, and reading comprehension), with context. | Answer the MCQA correctly | [Paper](https://arxiv.org/pdf/2108.00648.pdf) | [Github](https://github.com/zhongwanjun/AR-LSAT/tree/main/data) | | +| Magellan Benchmark | Real life task, Reasoning, Structured data | 23 datasets from several sources containing entities with attributes. Dirty datasets contain deliberate errors, such as attributes being in the wrong column, misspellings, etc. | Task: given two entities from two different tables, determine if they are the same or not. 
| [Paper](https://pages.cs.wisc.edu/~anhai/papers1/deepmatcher-sigmod18.pdf) | [Github](https://github.com/anhaidgroup/deepmatcher/blob/master/Datasets.md) | It’s likely that Abt-Buy and Buy are the same dataset | +| MBPP | Code task, Text-to-code | 1K entry-level Python crowd-sourced programming problems (description, solution, 3 unit test cases) - (58% mathematical, 43% list processing, 19% string processing, 9% integer sequences, and 2% other). | Solve the Python program | [Paper](https://arxiv.org/abs/2108.07732) | [Github](https://github.com/google-research/google-research/tree/master/mbpp )
        [Hugging Face](https://huggingface.co/datasets/mbpp) | Also contains an edited version (400 items) with unambiguous prompts and good signatures (can be interesting to look at later to see the impact of prompts on code gen) + a MathQA-Python dataset (adaptation of the MathQA dataset) | +| MMLU | Language Understanding | 15K multi-choice Q/A manually collected from various online sources, on many topics (legal, philosophy, economics, psychology, STEM, medicine, etc, - at high school to professional level) | Answer the MCQA | [Paper](https://arxiv.org/abs/2009.03300) | [Hugging Face](https://huggingface.co/datasets/lukaemon/mmlu )
        [Github](https://github.com/hendrycks/test ) | Seems like a strong/high-quality baseline | +| MRF (Misinfo Reaction Frames) | Generation, Misinformation capabilities | 200K pairs of claims from news headlines (climate change, covid 19, cancer illness, detailed sources in comments) + label (real/misinformation), the former annotated on veracity, likelihood of spread, writer intent by MTurk workers. | Task: must either predict the gold label or generate likely writer intent/reader perception/… | [Paper](https://aclanthology.org/2022.acl-long.222/) | [Github](https://github.com/skgabriel/mrf-modeling) | (Contains data from NELA-GT-2018-2020, SciDCC, Climate-FEVER, CoAID, CoronaVirusFacts/DatosCoronaVirusAlliance Database, ESOC Covid-19 Misinformation Dataset, DETERRENT) | +| MS MARCO | Question Answering, Retrieval | 1M anonymised questions with free-form human generated answers (from relevant web document extracts), some with added rewriting.
        | Original paper contains 3 tasks: 1) generating the correct answer, if possible, 2) same but answer should make sense even without context, 3) ranking 1000 passages on how relevant they are for the question. | [Paper](https://arxiv.org/abs/1611.09268) | [Github](https://microsoft.github.io/msmarco/) | Contains extended descriptions of QA datasets in lit review.
        In HELM, only the ranking task is looked at, and relevance is estimated looking at the log-likelihood of the prediction when asking “Does the passage answer the query?” | +| MS MARCO TREC, aka TREC 2019 | Retrieval | Datasets derived from MS MARCO, edited for either passage or document retrieval tasks, either doing full retrieval or top-n reranking (100 for documents, 1000 for passages). (see MS MARCO) | | [Paper](https://arxiv.org/abs/2003.07820) | [Data](https://trec.nist.gov/data/deep2019.html)
        [Github](https://microsoft.github.io/msmarco/TREC-Deep-Learning-2019.html) | | +| MultiRC | Language Understanding, Question Answering | 6K multiple choice questions over a diversity of topics | | [Paper](https://aclanthology.org/N18-1023.pdf) | [Data](https://super.gluebenchmark.com/tasks) | | +| NarrativeQA | Question Answering, Retrieval | 47K free-form human generated questions and answers, linked to 1.5K books (Project Gutenberg) and movie scripts (scraped) matched with plot summaries.
        | Task: from the summary or the story, answer or select the correct answer. | [Paper](https://arxiv.org/abs/1712.07040) | [Github](https://github.com/deepmind/narrativeqa) | For long range context testing, we could use this dataset to do QA from the full stories. Could be interesting for anything conversational imo. | +| Natural Questions | Open domain/Closed book, Question Answering| 207K aggregated google search queries + annotated wikipedia sample answer | | [Paper](https://aclanthology.org/Q19-1026/) | [Data](https://ai.google.com/research/NaturalQuestions/download) | | +| NewsQA | Question Answering | 100K human generated QA pairs from 12K news articles (CNN). Questions were generated from title + summary, answers from question + article, then kept through a validation mechanism.
        Likely intersects with CNN/DailyMail, as data extraction script was the same. | | [Paper](https://aclanthology.org/W17-2623/) | [Github](https://github.com/Maluuba/newsqa) | | +| OpenBookQA | Common Sense, Reasoning | 6K sentences, science reasoning needing common sense knowledge to extrapolate to new situations | | [Paper](https://arxiv.org/abs/1809.02789) | [Data](https://allenai.org/data/open-book-qa) | | +| PIQA | Common Sense, Reasoning | 20K physical common sense reasoning situations, | select the correct action to do from a context and answers | [Paper](https://arxiv.org/abs/1911.11641) | [Data](https://yonatanbisk.com/piqa/data/) | | +| PopularBooksCorpus_HELM | Generation, Memorisation | 20 books from BooksCorpus which appear in a list of bestsellers. | Task: from a random number of tokens beginning the first paragraph of the book, the model must generate a follow up - measure exact and near-exact reproduction. | [Paper](https://arxiv.org/abs/2211.09110) | [Data](https://drive.google.com/file/d/1RT29rRKNNXKgZBhXNbqevLwR440g44it/view) | | +| QuAC | In-Context Reading Comprehension | 100K questions in information seeking QA contexts (used Wikipedia to generate dataset) | | [Paper](https://aclanthology.org/D18-1241/) | [Data](https://quac.ai/) | | +| RACE | In-Context Reading Comprehension | 100K questions from English reading comprehension exam for Chinese mid/high school students | | [Paper](https://aclanthology.org/D17-1082/) | [Data](https://www.cs.cmu.edu/~glai1/data/race/) | | +| RAFT | Real life task, Text classification | Compilation of 11 datasets of naturally occurring classification tasks, of between 150 and 5K test items. | Task: in few shot from 50 labeled examples, provide meaningful labels. 
(Domains: medical, finance, research, English language, law, physics, AI safety, social networks) | [Paper](https://arxiv.org/abs/2109.14076) | [Hugging Face](https://huggingface.co/datasets/ought/raft) | Corpus: (ADE Corpus v2, Banking77, NeurIPS 2020 impact statement risks, OneStopEnglish, Overruling, Systematic review inclusion, TAI safety research, Terms of Service, TweetEval Hate, Twitter complaints, + Semiconductor org types, created for this) | +| RealToxicityPrompts | Generation, Toxicity detection | 100K naturally occurring sentences (selected from OpenWebText corpus, basically = reddit, and scored for toxicity with the PerspectiveAPI) split in two to create a prompt and continuation. | Task: generate the continuation from the sentence start, then toxicity is evaluated with PerspectiveAPI. | [Paper](https://arxiv.org/abs/2009.11462) | [Data](https://allenai.org/data/real-toxicity-prompts)
        [Github](https://github.com/allenai/real-toxicity-prompts) (the repo lacks a lot of info) | | +| ReCoRD | Language Understanding | 120K passage/cloze query/answer examples from news (CNN, DailyMail) with human filtering | | [Paper](https://arxiv.org/abs/1810.12885) | [Data](https://super.gluebenchmark.com/tasks) | | +| RTE | Language Understanding | 3K compilation of competition data on entailment | | [Paper](https://w4ngatang.github.io/static/papers/superglue.pdf) | [Data](https://super.gluebenchmark.com/tasks) | | +| SAT analogies | Language Understanding | 374 SAT analogy problems prior to 2005 (multiple choice questions of the form "a is to b what c is to ?"; the words are not the most frequent ones) | | [Paper](https://arxiv.org/pdf/2005.14165.pdf) | [Data dev](https://goo.gl/XWjas1)
        [Data test](https://goo.gl/BcTtB4) | | +| SIQA | Question Answering | | || | | +| SQuADv2 | In-Context Reading Comprehension | Combines SQuAD with 50K unanswerable questions | from a context, give an answer, but only if possible| [Paper](https://arxiv.org/abs/1806.03822) | [Github](https://rajpurkar.github.io/SQuAD-explorer/) | | +| StoryCloze | Cloze/Completion, Common Sense | 50K 5-sentences commonsense stories | choose the correct ending | [Paper](https://aclanthology.org/N16-1098/) | [Hugging Face](https://huggingface.co/datasets/story_cloze) | | +| StrategyQA | Common Sense, Reasoning | 2.8K questions needing reasoning from implicit knowledge | | [Paper](https://arxiv.org/abs/2101.02235) | | Best results with an external calculator added | +| Synthetic reasoning (natural) | Logical Reasoning, Reasoning | Synthetic data generated on the fly, containing a set of synthetic rules (conditional statements), facts (attributes), and the logical gold output. | | [Paper](https://arxiv.org/abs/2211.09110) | Can be generated with [Github](https://github.com/stanford-crfm/helm/blob/main/src/helm/benchmark/scenarios/synthetic_reasoning_natural_scenario.py) | Also called rule_induct in HELM | +| Synthetic reasoning (symbolic)_HELM | Logical Reasoning, Symbolic manipulation | Synthetic data generated on the fly using a pattern template. | Either test if the model is able to identify patterns (”beach + beach - pear” has “A + A - B” as pattern) or if the model can substitute strings in a given pattern. 
| [Paper](https://arxiv.org/abs/2211.09110) | Can be generated with [Github](https://github.com/stanford-crfm/helm/blob/main/src/helm/benchmark/scenarios/synthetic_reasoning_scenario.py) | | +| TriviaQA | Open domain/Closed book, Question Answering| 95K trivia QA (compositional questions, syntactic variability) | | [Paper](https://aclanthology.org/P17-1147/) | [Data](https://nlp.cs.washington.edu/triviaqa/) | | +| TruthfulQA | Question Answering | 817 questions about tricky factual claims (common misconceptions, falsehoods, …) over 38 categories, with true and false reference answers + a source to support true answers (+ 380 added questions). | | [Paper](https://arxiv.org/abs/2109.07958) | [Github](https://github.com/sylinrl/TruthfulQA) | | +| TyDiQA-GoldP | Multilingual, Question Answering | 204K multilingual QA pairs (unconstrained question elicitation from prompts, then Wikipedia article retrieval, and specific answer selection in the article if possible) (en, ar, ben, fin, ind, ja, ko, ru, tel, th and kiswahili). | MCQA | [Paper](https://aclanthology.org/2020.tacl-1.30/) | [Github](https://github.com/google-research-datasets/tydiqa) | Dataset generated can present interesting underspecification of questions and mismatch between question and answers language level. Might be harder than other datasets | +| Web Questions | Open domain/Closed book, Question Answering| Extracted 100K “Wh?” questions from Google Search API, then annotated by MTurkers - I suspect answers are partially out of date | MCQA | [Paper](https://aclanthology.org/D13-1160/) | [Website](https://nlp.stanford.edu/software/sempre/)| | +| WebNLG | Generation, Verbalization | 13K mappings between triples (subject, property, object, constructed from DBPedia, which is a KB from Wikipedia) and sentence verbalization (by crowd workers), about specific topics (astronauts, universities, monuments, buildings, characters from comics, food, airports, sports teams, written works). 
| Task: verbalize in a grammatical way. | [Paper](https://aclanthology.org/P17-1017.pdf) | [Hugging Face](https://huggingface.co/datasets/web_nlg) |
        There was a sentence selection for fluency and the sentences generated are relatively simple, but there is no description of annotators/crowdsourcers origins > maybe some data is not in “standard English”. | +| WiC | Language Understanding | 7K, classification of whether a word occurring in two different contexts has the same meaning or not | | [Paper](https://aclanthology.org/N19-1128/) | [Site](https://super.gluebenchmark.com/tasks) | | +| WikiFact_HELM | Cloze/Completion | 12 domains with 1K triples (subject, relation, object) sampled from Wikipedia and cleaned. | Task: predict missing item in the sentence made of the relation. | [Paper](https://arxiv.org/abs/2211.09110) | [Codalab](https://worksheets.codalab.org/rest/bundles/0x8c3b60eb7c6b462e822a150f194d3b35/)
        [Github](https://github.com/stanford-crfm/helm/blob/main/src/helm/benchmark/scenarios/wikifact_scenario.py) | | +| WikiLingua | Generation, Multilingual, Summarization | 43K article/summary pairs constructed from WikiHow in 18 languages (on the site, articles are written with a summary sentence + detailed paragraph per step: in the dataset, summaries are the concatenation of the summary sentences, and articles of the detailed paragraphs| Summarization | [Paper](https://aclanthology.org/2020.findings-emnlp.360/) | [Github](https://github.com/esdurmus/Wikilingua ) | Palm: Prefixed with a prompt, truncated article to 2048 tokens
        I suspect data creation can lead to very “robotic” language for the summary baseline, which could cause more fluid summaries to be under-scored - though ROUGE shouldn’t be too prone to that. | +| Winogender | Bias detection | | || | | +| Winograd | Reasoning, Winograd | 273 to 285 examples where one must disambiguate who/what a pronoun is referring to, in sentences specially constructed to be ambiguous for statistical models but not for humans | Disambiguation of pronoun | [Paper](https://dl.acm.org/doi/10.5555/3031843.3031909) | [Website](https://cs.nyu.edu/~davise/papers/WinogradSchemas/WSCollection.xml) | Not sure if GPT3 was evaled on this one or the SuperGLUE one | +| WinoGrande | Reasoning, Winograd | 43K adversarial Winograd-style sentences | | [Paper](https://arxiv.org/abs/1907.10641) | [Website](https://winogrande.allenai.org/) | | +| WSC | Language Understanding, Winograd | Winograd Schema Challenge (see Winograd) | | [Paper](https://w4ngatang.github.io/static/papers/superglue.pdf) | [Website](https://super.gluebenchmark.com/tasks) | | +| XSUM | Summarization | 226K news articles (BBC, 2010 to 2017) matched with their single sentence summary (comes from the article). Task: Summarize. (Domains: News, Politics, Sports, Weather, Business, Technology, Science, Health, Family, Education, Entertainment and Arts) | | [Paper](https://aclanthology.org/D18-1206/) | [Github](https://github.com/EdinburghNLP/XSum) | | +| XSum | Generation, Summarization | 226K news summary/article pairs from the BBC (2010 - 2017) extracted from the WayBack machine | | [Paper](https://aclanthology.org/D18-1206/) | [Hugging Face](https://huggingface.co/datasets/xsum)| Could be interesting to manually check if the model's recent knowledge creates discrepancies in the summaries of old news. 
| + +### Dataset ideas to manually reproduce + +| Evaluation name | Task type | Task content | Source | Dataset | Comments | | +| ------------------------------ | ---------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | --- | +| ✍️ GSM8K-Python | Code task, Text-to-code | Python version of the GSM8K dataset (8.5K grade school math problems) | [Paper](https://arxiv.org/abs/2204.02311) | N/A | | | +| ✍️ MRF | Generation, Manual evaluation, Misinformation capabilities | 250 headlines extracted from the MRF dataset, grouped in 80 clusters by thesis. Task: from the thesis + 5 headlines, the model must generate plausible headlines which supports the thesis. Annotators evaluate if 1) the headline supports the thesis and 2) looks real. | [Paper](https://arxiv.org/abs/2211.09110) | [Data](https://drive.google.com/uc?export=download&id=1uVJbsgPCHFAvH43I6SVvU3Ayo8dh-y_N) | See [report](https://cset.georgetown.edu/publication/truth-lies-and-automation/) page 6 for a detailed explanation of the original process, plus sections 8.5.2, E.5, and 5.5 in the HELM paper. 
| | +| ✍️ News article generation | Generation | Generated 25 articles from titles and subtitles, 80 humans had to classify if generated or original | [Paper](https://arxiv.org/abs/2005.14165) | | | | +| ✍️ Numeracy Prediction | Symbolic manipulation | “requires the model to perform symbolic regression given a few examples, and apply the number relationship to a new input” | [Paper](https://arxiv.org/abs/2211.09110) | [Github](https://github.com/stanford-crfm/helm/blob/main/src/helm/benchmark/scenarios/numeracy_scenario.py) | | | +| ✍️ SVG datasets | | Could construct an SVG dataset to see if models can indeed generate or interpret SVG drawings | [Twitter thread](https://twitter.com/zswitten/status/1631178997508997120) | | | | +| ✍️ Theory of mind datasets | | Could likely be easy to generate | [Paper](https://arxiv.org/abs/2302.08399) | | | | +| ✍️ Wedging prompts | Generation, Manual evaluation, Misinformation capabilities | 11 prompts with specific intent (ex: influence voting behaviors, target specific groups by generating pro/anti X rhetoric) augmented with 3 examples. Task: generate follow up examples.
        | [Paper](https://cset.georgetown.edu/wp-content/uploads/CSET-Truth-Lies-and-Automation.pdf) | [Data](https://drive.google.com/uc?export=download&id=1kWB3_F4Tobc_oVGC_T-a5DHEh-AB4GTc) | In HELM: use manual evaluation to determine if the generated message 1) addresses targeted group; 2) supports desired message; 3) is divisive | | +| ✍️ Word scrambling | Symbolic manipulation | 10K examples for each of 5 character manipulation tasks (word with cycled letters, anagrammed, random insertions, reversed). Model needs to recover the original word | [Paper](https://arxiv.org/abs/2005.14165) | | Easy to generate/automate, see Section 3.9.2 of GPT3 paper | | diff --git a/app/src/content/chapters/general-knowledge/2025-evaluations-for-useful-models.mdx b/app/src/content/chapters/general-knowledge/2025-evaluations-for-useful-models.mdx new file mode 100644 index 0000000000000000000000000000000000000000..c8ac25f7465b2dcf9a36e909f34b360decd1ceaa --- /dev/null +++ b/app/src/content/chapters/general-knowledge/2025-evaluations-for-useful-models.mdx @@ -0,0 +1,173 @@ +--- +title: "2025 evaluations" +--- + +import Note from "../../../components/Note.astro"; +import Sidenote from "../../../components/Sidenote.astro"; + +You can evaluate **specific capabilities** on their own - it's usually quite interesting to get signal when training, or when comparing base/pretrained models. (However, if you select and validate your training methods with the following evaluations, reporting on them on the final model is slightly biased as you have already oriented your training method towards good results on them). + +#### Reasoning and commonsense + +Reasoning and commonsense datasets are often “historic” datasets, built in the age of BERT and embedding models, before the LLM craze. 
They were quite challenging at the time (especially because they were often adversarially built for models of the time), but now they are 1) too easy and 2) contaminated/saturated, and should only be used for ablations or as pretraining evaluations. The bigger datasets also sometimes contain errors or low quality questions as they tend to have been built through Amazon Mechanical Turk in order to scale up fast and at low cost (something now done by using LLMs to generate evaluation questions). + +[ARC](https://arxiv.org/abs/1803.05457) (2018) (not to be confused with ARC-AGI) is a grade school science MCQA dataset built from human tests. The choices were selected adversarially for word co-occurrence systems at the time. It has several subsets; the higher quality `challenge` one is still in use today for pretraining. [WinoGrande](https://arxiv.org/pdf/1907.10641) (2019) is a crowdsourced (mechanical turk + validation) pronoun resolution/fill in the blank dataset, using adversarial pairs of items to trick models. Both datasets remained quite hard for models until 2022 to 2023. + +A number of historic datasets look specifically at reasoning requiring some sort of commonsense understanding and grounding. [HellaSwag](https://arxiv.org/abs/1905.07830) (2019) requires LLMs to select the correct next sentence in a list of adversarial choices, where the text comes from captions in ActivityNet and from tutorials in WikiHow. (It’s the follow-up to a dataset called SWAG). As most sentences come from tutorials or descriptions of activities, they often require physical commonsense grounding to solve. In the same vein, [CommonsenseQA](https://arxiv.org/abs/1811.00937) (2018) is a dataset of commonsense MCQA built from ConceptNet - annotators write questions, then use conceptually close distractors as options. 
[PIQA](https://arxiv.org/abs/1911.11641) (2019) looks specifically at physical commonsense questions (created from examples from [Instructables.com](http://Instructables.com), again with adversarial choices from semantic perturbations or rewriting). [OpenBookQA](https://arxiv.org/abs/1809.02789) (2018) provides open book facts to help answer MCQA questions - however, these questions also require latent common sense knowledge. + +A more recent cool reasoning dataset is [Zebra Logic](https://arxiv.org/abs/2502.01100), using logic puzzles to test model reasoning capabilities. Their method allows for unlimited generation of new puzzles, which limits contamination. + +#### Knowledge +The main evaluation dataset for knowledge has been [MMLU](https://arxiv.org/abs/2009.03300) (2020). It reached saturation/contamination, and after a more in-depth examination, a number of issues were identified: incomplete questions referring to absent documents, incorrect ground truths, ambiguous questions, and blatant US-centrism in the topics chosen. It was therefore cleaned in [MMLU-Redux](https://arxiv.org/abs/2406.04127) (2024), extended with more complex questions and more answers in [**MMLU-Pro**](https://arxiv.org/abs/2406.01574) (2024, the main replacement used by the community at the moment), and translated/annotated for cultural bias in [Global-MMLU](https://arxiv.org/abs/2412.03304) (2024). These are used mostly for pretraining evaluations and ablations. + +For post training, people look at harder high quality knowledge datasets. [**GPQA**](https://arxiv.org/abs/2311.12022) (2023) contains custom PhD level questions in biology/chemistry/physics, made to be answerable by PhD students in the correct domain and not otherwise. The most used subset is the `diamond` one, but since its publication in 2023 it has also started reaching contamination. 
+ +Last but not least, the pompously named but very high quality [**Humanity's Last Exam**](https://agi.safe.ai/) (2024) contains 2.5K crowdsourced questions by experts in their field, across domains. It is mostly private, and questions require both complex knowledge and reasoning. It has not been broken yet, and it's imo a cool dataset. The only issue is that since there is no way to get a model scored fast, people now evaluate against it by using an LLM judge to assess their answers, insted of checking against ground truth, so it's one of these evaluations where you'll get really uncomparable results in the wild. + +However, though testing models for the raw quality of their latent knowledge made a lot of sense a couple years back (and is still interesting while training to test model quality, with evals like MMLU-Pro during pretraining and GPQA/HLE for post training), I think we will slowly phase out of benchmarks such as this in the next years, for 2 reasons. + +1. They are becoming more and more indecipherable for humans: questions are becoming so complex that it's almost impossible for non experts to understand what performance on each question means (and to make sure the datasets themselves do not contain mistakes) +2. Now that our models are connected to tools, such as internet access, latent knowledge evaluations are increasingly becoming web search and retrieval evaluations, so they make less sense as such. In short, we're moving from closed book to open book evaluations. As a comparison, in the French school system, you get closed books examinations in high school, but as you enter university, it's often assumed that you will get access to databases, internet, and scoring becomes less about what you learnt by heart, and more about how you reason given free access to information. I believe this is also a change we will see in LLM evaluation with the increase of model capabilities. 
+ +#### Math +Math evaluation datasets have been used as proxies for reasoning and logic benchmarking, independently of, obviously, also checking if models can solve math problems. + +The two reference math evaluation datasets were [GSM8K](https://arxiv.org/abs/2110.14168) (2021), containing grade school math problems and [MATH](https://arxiv.org/abs/2103.03874) (2021), an aggregation of Olympiad problems present on the web, which reached saturation/contamination in the last years. The former was extended by [GSM1K](https://arxiv.org/abs/2405.00332) (2024), a recreation with 1K new problems, to test which models were contaminated on the former, [GSM-Plus](https://arxiv.org/pdf/2402.19255), a rewriting of the problems with adversarial changes (distractors, numerical variations, and so forth) and [GSM-Symbolic](https://arxiv.org/abs/2410.05229) (2024), less used, but a very interesting re-writing of GSM8K as problem templates, to prevent contamination: problems can be regenerated ad infinitum. + +The community has now been focusing on using: +- The follow ups to MATH: [**MATH-500**](https://huggingface.co/datasets/HuggingFaceH4/MATH-500) (a representative subset of 500 problems sampled to avoid overfitting) and MATH-Hard (only the 500 hardest questions) +- **AIME** ([24](https://huggingface.co/datasets/HuggingFaceH4/aime_2024), [25](https://huggingface.co/datasets/math-ai/aime25)), American olympiad datasets for high schoolers, taken as is at publication. These datasets are interesting because, since they are made of problems renewed every year with equivalent difficulty, they allow testing for contamination by comparing results at publication with results on the previous year's dataset +- [**Math-Arena**](https://matharena.ai/), an up to date compilation of competitions and olympiads, updated regularly (it contains AIME25, but a lot of other competitions too!)
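The template idea behind GSM-Symbolic mentioned above can be sketched in a few lines. This is only a toy illustration: the template text, the names and the value ranges are made up for the example and are not taken from the actual dataset. The point is that a fresh problem, with its ground truth, can be drawn on demand, so a memorised question/answer pair does not help.

```python
import random

# Hypothetical template: wording, names and ranges are illustrative only.
TEMPLATE = (
    "{name} has {a} apples and buys {b} bags of {c} apples each. "
    "How many apples does {name} have now?"
)

def generate_problem(rng: random.Random) -> tuple[str, int]:
    """Sample one concrete problem and compute its ground-truth answer."""
    name = rng.choice(["Ada", "Lin", "Omar"])
    a = rng.randint(2, 20)
    b = rng.randint(2, 9)
    c = rng.randint(2, 12)
    question = TEMPLATE.format(name=name, a=a, b=b, c=c)
    answer = a + b * c  # the answer is derived from the sampled values
    return question, answer

# A seeded generator makes an evaluation run reproducible,
# while a new seed gives a new, unseen set of problems.
rng = random.Random(0)
for question, answer in (generate_problem(rng) for _ in range(3)):
    print(question, "->", answer)
```

Changing the seed between evaluation rounds is the cheap way to check whether a score holds on problems the model cannot have seen.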
+ +Most of these datasets are actually no longer "that hard", since they stop at grade school level (even though GSM-Symbolic allows to generate problems with more recursion levels, making them synthetically harder). On the other side of the spectrum, [FrontierMath](https://arxiv.org/abs/2411.04872) (2024) was an attempt at providing considerably harder math problems, written individually by mathematicians for the occasion. The dataset was theoretically private (but it appeared OpenAI has had access to parts of the dataset - such a shame). [Humanity's Last Exam](https://agi.safe.ai/) (2025) (introduced in the knowledge section) also contains interesting “made for the occasion” math problems requiring complex reasoning (notably some theorem proving). + +I would personally use AIME25 and MATH-500 for pretraining evaluations, and the Math-Arena for post training. + +#### Code +Since agents need to interact with tools, they need coding abilities, either to call tools directly if they are code agents, or understand how to debug tool output in case of problems (for code and json agents both, see the difference [here](https://huggingface.co/learn/agents-course/en/unit2/smolagents/tool_calling_agents)). Coding evaluation sets are also good proxies for reasoning. + +Historically in 2021, code evaluation sets were [MBPP](https://arxiv.org/abs/2108.07732), 1K crowdsourced Python only entry-level programming problems, [APPS](https://arxiv.org/abs/2105.09938), 10K code generation problems curated from programming interviews and sharing websites, and [HumanEval](https://arxiv.org/abs/2107.03374), introduced with the Codex model, which contrary to the previous is made of "specifically made for the release" problems, which was super neat then! It also came with a sandbox to avoid problematic code execution on the evaluator's machine. 
(Last thing this paper introduced was an estimator for `pass@k`, which before that was computed with a literal check on whether an evaluation was a success more than k times on n). + + +The [EvalPlus](https://openreview.net/pdf?id=1qvx610Cu7) (2023) team made HumanEval+ and MBPP+, extensions of the former, by adding more test cases and fixing bugs in the original datasets as well as adding more inputs. [EvoEval](https://arxiv.org/abs/2403.19114) (2024) also introduced a variation on HumanEval by semantically rewriting the problems and adding difficulty labeling. + +For final models, you might want harder or uncontaminated problems. + +[**LiveCodeBench**](https://arxiv.org/abs/2403.07974) (2024) follows a similar "grabbing from leetcode websites" approach, but is very interesting because it stores the problem date, to compare model performance on problems created before and after they finished training. This was an excellent contamination free benchmark, and I'm looking forward to an update! + +[**AiderBench**](https://aider.chat/docs/leaderboards/) (online since end of 2024 I think?) also uses data from existing coding websites (Exercism to be specific), but goes beyond problem solving by testing specifically code editing and refactoring. + +For post training, you want more holistic evaluations, and a couple benchmarks moved beyond evaluation on standalone problems, which were not evaluating complex coding abilities. [RepoBench](https://arxiv.org/abs/2306.03091) (2023) tests repository level auto completion systems in Python or Java, using code from Github as source. It was built by masking random lines in code bases and asking for completions, either a cross file or in file function, and defines several tests level (retrieval, completion, a combination). 
+ +[**SweBench**](https://openreview.net/pdf?id=VTF8yNQM66) (2024) is a more well known and complete version of this, also using github, but this time testing if models can solve existing issues, so logic understanding, cross file editing and execution, long context reasoning, etc. + +[**CodeClash**](https://codeclash.ai/) (2025) is the coding version of an arena, where models write code which competes against other models code, edit, and iterate. + +At this time, I would recommend following LiveCodeBench, AiderBench and the higher quality subset of SWE-Bench (SWE-Bench verified), and reading the [METR report](https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/) on actual code assistant usefulness. + +#### Long context +To correctly interact with users over a long discussion, without losing track, you need good long context management. (Funny to think that 3 years ago, maximum context lengths for models were 2048 tokens, when now we're largely at 128K and beyond). + +The evaluation which started testing this in 2023 is probably [NIAH](https://github.com/gkamradt/LLMTest_NeedleInAHaystack), (Needle in a Haystack), where you place a random fact in a long unrelated text and ask the model to retrieve it. It provides a neat framework to evaluate where in the context a model is most likely to forget stuff, and from which context length. In 2023 models were really bad at it, in 2025 it's close to solved. + +More complex long context extensions have emerged since. [RULER](https://arxiv.org/pdf/2404.06654) (2024) adds multi-hop tracing (requiring the model to follow chains of variables to get the correct value), word frequency changes, and adds a QA variation of NIAH. it's also close to solved now. 
[Michelangelo](https://arxiv.org/pdf/2409.12640v2) (2024, also sometimes called MRCR for multi round co reference) is also using synthetic long context data: tasks (of varying length) test whether models can reproduce precisely unique portions of the context (as well as identify if relevant information is present) and understand sequence of modifications to a text. It was then extended in the [OpenAI MRCR](https://huggingface.co/datasets/openai/mrcr) (2025). [InfinityBench](https://arxiv.org/abs/2402.13718) (2024) is multilingual (En and Zh), and provides 100K tokens synthetic data tasks, across a variety of objectives (QA, retrieval as in NIAH, computations over very long context, ...). InfinityBench still provides some signal. + +[**HELMET**](https://arxiv.org/abs/2410.02694) (2024) combines tasks and existing benchmarks to get a big single dataset with more signal: RAG and QA datasets (Natural questions, TriviaQA, PopQA, HotpotQA, Narrative QA and InfinityBench), recall (RULER and JSONKV), generation with citation (subsets of ALCE), summarisation, reranking passages (MS MARCO), in context learning (TREC, NLU, Banking77, CLINIC150). Benchmark aggregations are exhaustive but present the risk of measuring things two times : don't go testing your model against both HELMET and InfinityBench, then aggregating the results, for example, as you would run the same evaluation twice! In 2025, it still has enough discriminative power to compare models. + +My favorite long context evaluations ideas are the [Novel Challenge](https://arxiv.org/abs/2406.16264) (2024), 1K true/false claims about fictional books published in the last year (by readers of said books!) 
requiring having read and understood the full text to answer properly, and the [**Kalamang translation dataset**](https://arxiv.org/abs/2309.16575) (2024), where models need to properly translate from English to Kalamang from reading a grammar book (Kalamang is such a low resource language that it has no online presence - only 200 speakers). The Kalamang translation set could notably be expanded to other low resource languages (but it would be cool to expand to use a rule based grammar checker to test generation validity to get strict accuracy instead of relying on BLEU...). + +#### Instruction Following +The two main instruction following datasets are [**IFEval**](https://arxiv.org/abs/2311.07911) (2023) and its extension [**IFBench**](https://arxiv.org/abs/2507.02833) (2025). IFEval is one of the smartest evaluation ideas in the last years, in my opinion: models are asked to follow formatting instructions (about keywords, punctuation, number of words/sentences, file type formatting such as markdown or html, etc). Each of these conditions can be checked with a specific parsing test: this means that this evaluation is one of the rare free form generative evaluation where you can get a strict score without relying on a model judge. + +More generally, it falls into the functional correctness/unit test evaluation type, which is my personal favorite way to evaluate models. It's also very easy to regenerate or extend to prevent contamination. + +Side note, but some benchmarks also test "non instruction following" (non compliance): [CoCoNot](https://www.arxiv.org/pdf/2407.12043) (2024) notably tests if models will or won't comply with incomplete (underspecified/unclear), unanswerable (by lack of information or AI-humanizing, often hallucinations triggering), or unsafe requests. It used manual queries writing, models to write non compliants requests, then filtered to create an eval set presented as a classification problem. 
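The rule-based checking described for IFEval above can be sketched as follows. This is a minimal illustration, not the benchmark's actual code: the three checker functions and the scoring helper are hypothetical names written for this example, and the real benchmark covers many more instruction types.

```python
import re

def check_max_words(response: str, max_words: int) -> bool:
    """Instruction: 'answer in at most N words'."""
    return len(response.split()) <= max_words

def check_contains_keyword(response: str, keyword: str) -> bool:
    """Instruction: 'include the keyword X' (case-insensitive)."""
    return keyword.lower() in response.lower()

def check_bullet_count(response: str, n: int) -> bool:
    """Instruction: 'use exactly N markdown bullet points'."""
    bullets = re.findall(r"^[ \t]*[-*] ", response, flags=re.MULTILINE)
    return len(bullets) == n

def score(response: str, checks) -> float:
    """Fraction of instructions followed; no model judge involved."""
    results = [check(response, **kwargs) for check, kwargs in checks]
    return sum(results) / len(results)

response = "- Tokenizers split text\n- Models eat numbers"
checks = [
    (check_max_words, {"max_words": 20}),
    (check_contains_keyword, {"keyword": "tokenizers"}),
    (check_bullet_count, {"n": 3}),  # fails: only two bullets
]
print(score(response, checks))
```

Because each check is a plain function, regenerating the benchmark with new parameter values (a different word limit, another keyword) costs almost nothing.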
+ +#### Tool-calling +The emergence of tools is one of the features which started moving LLMs into the agentic realm. + +[**TauBench**](https://arxiv.org/pdf/2406.12045) (2024) evaluates a model on its ability to answer a user's query in the retail and airline domains (order/book/look for products/etc). The database mimics real domain data with synthetic samples, and the model is considered correct when 1) its actions updated the database correctly and 2) it answered the user appropriately. To make this benchmark automatic, the user is mocked up by an LLM, which makes this evaluation quite costly to run and prone to errors. Despite these limitations, it's widely used, notably because it reflects real use cases well. + +[ToolBench](https://arxiv.org/pdf/2305.16504) (2023) requires calling APIs (OpenWeather, Cat, HomeSearch, TripBooking, GoogleSheets, WebShop, Tabletop, etc) to solve 100 test cases across datasets, each needing between one and 10 tool calls. Some of these APIs are mock ups and some of them are real, which makes the dataset susceptible to accidental failure. It was therefore fixed and extended in [StableToolBench](https://arxiv.org/pdf/2403.07714) (2025), which introduces a general VirtualAPIServer mocking up everything to ensure evaluation stability, however relying on an LLM judge for evaluation, introducing another layer of bias. + +[**BFCL**](https://openreview.net/pdf?id=2GmDdhBdDk) (2025, but the benchmark is actually a couple of years old) evolved considerably over the years, and in its current version contains 4 subsets: single turn (simple tool calls), crowdsourced real life function calls from users, multiturn conversations (to test accuracy in long context and query answering with tool calls) and agentic (web search, memory, sql data interaction). It uses a combination of Abstract Syntax Trees, execution response and state matching (is the final state the expected one) to evaluate if calls are correct.
People are focusing on the v3 to test tool calling specifically, and the v4 tests web and search tool use. + +Lastly, with the creation of MCPs, some benchmarks arose to test MCP oriented tool calling - however all mostly relying on model judges, and using real world APIs, which can introduce potential failure cases/lack of reproducibility due to network issues (seems like added load for website creators is not too much of an issue as the userbase of most MCP covered is big enough). + +[MCPBench](https://arxiv.org/abs/2508.20453) (2025) connects LLMs to live, real world MCP servers (Wikipedia, HF, Reddit, Steam, arxiv, ...) with tasks requiring multiple turns to solve (created synthetically). The evaluation combines rule based checks on tool call validity and success with an LLM judge to assess if queries were properly answered. + +[**MCP-Universe**](https://arxiv.org/abs/2508.14704) (2025) uses 11 MCP servers across varied real world topics (IRL navigation, 3D design, web search, etc). What’s cool in this one is that evaluation relies on several strict evaluators, one for format correctness, and two for answer correctness: as tasks can be static (asking things that do not change) or dynamic (github stars in a repo, weather, …), in the latter case answer correctness uses a task-dependant execution based evaluation framework which grabs the latest correct answer from the relevant source automatically and compares the model output to it. This is way neater than relying on LLM judge! + +[**LiveMCPBench**](https://arxiv.org/abs/2508.01780) (2025) provides a large locally deployable collection of MCP servers to test how good models are at discriminating between tools to accomplish tasks. Best models are already reaching 80% - so we're close to saturation. However, testing if models can select proper tools in very long lists is a good use case which will be increasingly important as the web goes mcp. 
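Coming back to the Abstract Syntax Tree matching mentioned for BFCL earlier in this section, here is a toy sketch of the idea using Python's standard `ast` module. It is an assumption-laden illustration, not BFCL's implementation: the `get_weather` tool is hypothetical, and the check only handles calls whose arguments are plain literals.

```python
import ast

def parse_call(call_str: str):
    """Parse 'f(x=1, y="a")' into (function name, positional args, keyword args)."""
    node = ast.parse(call_str, mode="eval").body
    if not isinstance(node, ast.Call) or not isinstance(node.func, ast.Name):
        raise ValueError("expected a simple function call")
    # literal_eval raises ValueError if an argument is not a plain literal
    args = [ast.literal_eval(arg) for arg in node.args]
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
    return node.func.id, args, kwargs

def calls_match(predicted: str, expected: str) -> bool:
    """Structural comparison: keyword order and quoting style do not matter."""
    try:
        return parse_call(predicted) == parse_call(expected)
    except (SyntaxError, ValueError):
        return False  # unparsable model output counts as a failure

# Hypothetical tool call, written two different ways by two models
print(calls_match("get_weather(city='Paris', unit='celsius')",
                  'get_weather(unit="celsius", city="Paris")'))
print(calls_match("get_weather(city='Lyon')", "get_weather(city='Paris')"))
```

Comparing parsed structures rather than raw strings is what makes this kind of check strict without being brittle about formatting.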
+ + +(By the way, here's a cool [doc](https://www.anthropic.com/engineering/writing-tools-for-agents) on how to write good tools.) + +While testing individual capabilities provides valuable signal, real-world assistant performance comes from how these capabilities combine. A model might excel at reasoning but fail when that reasoning must be integrated with tool calling and long context management simultaneously, so we need evaluations requiring the orchestration of multiple capabilities together. + +#### Assistant tasks +I believe that **assistant tasks** are going to be one of the main ways to do next level evaluations: solving them requires a combination of many capabilities (long context, reasoning, tool calling, ...), while the benchmarks themselves provide insight on domain-specific performance in a useful real world setup. They also tend to be more understandable (by the general public) than specific capabilities benchmarks. If the benchmarks are general enough, they do not check which precise tools were used, but instead if the end result is correct, as complex tasks allow several paths to success. + +**Real life information retrieval** + +[**GAIA**](https://arxiv.org/abs/2311.12983) (2023) kickstarted modern agentic evaluation by requiring models to use a combination of tools, reasoning and retrieval to solve real life queries (sometimes including documents). Questions were split in 3 levels, the first one now saturated and the third one still hard for models. It's also one of these benchmarks where the numbers you find will vary with the evaluation method, because people either report on the public validation set or use LLM judges to evaluate against the private test set (even though there is a public leaderboard [here](https://huggingface.co/spaces/gaia-benchmark/leaderboard)).
+ +It was later replicated in [BrowseComp](https://cdn.openai.com/pdf/5e10f4ab-d6f7-442e-9508-59515c65e35d/browsecomp.pdf) (2025) which tests the same thing (can a model find the adequate answer to a specific query using tools and online information) but does not guarantee uniqueness of result, as questions were constructed by starting from the result and building a question from it, with varying levels of difficulty: for example, from a specific paper to retrieve, a question will be created by combining information about metadata, such as "which paper about Topic was published at Conference with one Nationality author and two people from Entity?" However, the benchmark is probably also harder at the moment. + +[**GDPval**](https://arxiv.org/abs/2510.04374) (2025) evaluates models on 44 occupations from the "top industries contributing to US GDP", comparing model performance with human performance using model judges. + +Lastly, [GAIA2](https://huggingface.co/blog/gaia2) went beyond simple information retrieval, using a mock up mobile environment to test how well assistants can correctly answer queries relying on chains of events and tool calls. As of now, time sensitive and deliberately noisy subsets (mocking up failing API calls) are the hardest for models, whereas search and execution seem extremely easy for SOTA models. + +**Science assistants** + +[SciCode](https://arxiv.org/abs/2407.13168) (2024) tests if models can solve real life scientific problems by writing appropriate scientific code, across STEM fields (from biology to math/chem/...). Problems are drawn from real life workflows, and each core issue is decomposed in easier subproblems. For the first version, evaluation was done by scientists and a model judge - models were quite bad at it at publication (less than 5% scores) but I'm unsure where up to date results can be found.
+ +[PaperBench](https://arxiv.org/abs/2504.01848) (2025) similarly tests if models can replicate ML research, but this time with a harder setup: given ICML high quality papers, models must reconstruct the matching code base (8K individually graded tasks have been contributed by the authors of said papers, grouped as rubric trees with weighting for the final grades). Benchmark is evaluated with an LLM judge (though I suspect some of it could be done automatically by constraining a bit the shape of the code asked for). + +[DSBench](https://arxiv.org/pdf/2409.07703) (2025) is a multimodal data analysis benchmark using Kaggle and ModelOff (financial data) samples. From the examples in Appendix it seems that questions from ModelOff are provided in a multiple choice setup, which likely makes the task easier, where the Kaggle tasks each have their own metric. + +[**DABStep**](https://arxiv.org/abs/2506.23719) (2025) evaluates model on previously private (therefore uncontaminated) operational data analysis workloads using real life questions and data. All problems require multi step reasoning and varied document parsing, as well of course as specific data manipulation skills. It's a neat eval because it's hard and replicates actually useful real world use cases, and because each problem has a ground truth, so evaluation is unbiased and not too costly. + +Assistant tasks test integrated capabilities in realistic scenarios, but they're either dynamic and read only, or static in environment which doesn't change. To evaluate adaptability and dynamic decision-making, we need environments that can "surprise" the model. + +#### Game based evaluations +**Game-based** benchmarks are very interesting for several reasons: they usually evaluate adaptability to a changing environment (contrary to most assistant tasks which are static), require long context reasoning, and last but not least, are **understandable** by most people. 
However, they are not grounded in real life, nor do they necessarily reflect good performance on actually useful use cases. + +The most famous formal evaluation among these is probably [ARC-AGI](https://arcprize.org/arc-agi). The first version (2019) was made of puzzle grids in a sequence, where models had to find the last item of said sequence without explicit rules being provided. This benchmark is to me very reminiscent of logic-oriented IQ tests, and it was almost solved in 2024. A similar benchmark (extrapolation of rules) is [Baba is AI](https://arxiv.org/abs/2407.13729) (2024). The latest version of the bench, ARC-AGI3 (2025, ongoing), is still in development, and contains entirely new games (requiring exploration, complex planning, memory management, ...) made specifically for the benchmark. Current best solutions on available problems are bruteforcing the games. + +The community and model providers have explored a number of existing games with LLMs. Single player adventure games/RPGs like [TextQuests](https://huggingface.co/blog/textquests) (2025) or [Pokemon](https://github.com/benchflow-ai/benchflow/tree/main/libs/pokemon-gym) (2024) (Twitch for [Claude](https://www.twitch.tv/claudeplayspokemon) and [Gemini](https://www.twitch.tv/gemini_plays_pokemon) for ex) require very long range planning to reach objectives, which in turn requires adequate long context memory management, reasoning, and backtracking abilities. The same abilities are needed for single player survival games like [Crafter](https://arxiv.org/abs/2109.06780) (2021, Minecraft inspired). A number of single player game environments have been integrated into the [Balrog](https://arxiv.org/pdf/2411.13543) (2024) benchmark.
+ +Competitive bluffing games like [Poker](https://arxiv.org/html/2501.08328v1) (2025), Mafia variations like [Town of Salem](https://github.com/summersonnn/Town-Of-Salem-with-LLMs) (2025) and Werewolf (2025, [here](https://arxiv.org/abs/2407.13943)/[there](https://werewolf.foaster.ai/)), or [Among us](https://antimlabs.com/amongais) are very interesting to test logic, reasoning, as well as deception abilities. Claude Opus 4 is for example incapable of winning Town of Salem as a vampire (deceptive role) but does well as a peasant (non deceptive role). Cooperative games like [Hanabi](https://arxiv.org/abs/2510.04980) can also be used to test adaptability and communication ability in a constrained environment. + +What's also very neat about these is that they have a single and unambiguous pass/fail metric: did the LLM win the game or not? At the moment, if I were to use these to evaluate models I would probably look at TextQuests for abilities and Town of Salem for safety. + +Beyond testing capabilities in controlled environments, people have explored the ultimate ungameable task: predicting the future. + +#### Forecasters +In the last year, a new category of impossible to contaminate tasks emerged: forecasting. (I guess technically forecasting on the stock markets can be cheated on by some manipulation but hopefully we're not there yet in terms of financial incentives to mess up evals). They should require a combination of reasoning across sources to try to solve questions about events which have not occurred yet, but it's uncertain that these benchmarks are discriminative enough to have strong value, and they likely reinforce the "slot machine success" vibe of LLMs. (Is the performance on some events close to random because they are impossible to predict or because models are bad at it? In the other direction, if models are able to predict the event correctly, is the question too easy or too formulaic?)
+ +[FutureBench](https://huggingface.co/blog/futurebench) tests if models can predict future news-worthy events. It uses 2 sources: browsing and an LLM generating questions with a weekly time horizon, and user predictions from betting markets. All data is heavily filtered and cleaned before use. For now, models are barely better than random on human created bets, and succeed three quarters of the time on model generated questions (likely easier). + +[FutureX](https://arxiv.org/abs/2508.11987) is similar, but uses an array of specific websites (prediction markets, government websites, general ranking websites and real time data platforms), then uses templates to generate questions about potential future events (`when will STOCK reach POINT?`). 500 questions are generated daily, with filtering of accidentally irrelevant questions. + +A similar approach is used to generate questions in [Arbitrage](https://arxiv.org/pdf/2412.18544), the core difference being the time horizon: events there should be resolved in 2028. + +In a similar vein, you'll also find arenas where LLMs are provided with money to actively trade on financial markets (like Alpha Arena or Trading Agents) - these experiments are less likely to give meaningful results since, because of their costs, they tend to be run only once per model, so you get no statistical significance there. + + +The landscape of evaluation has evolved with the jumps in capabilities, from testing isolated skills to measuring integrated performance in more realistic scenarios.
+ +As of Nov 2025, I recommend using: + +- **Core capabilities** (for model builders): Old capabilities evals for training, and for post training AIME26 when it will come out, GPQA, IFEval, SWE-Bench, a long range eval of your choice like HELMET, TauBench or BFCL if you're targetting tool use +- **Core capabilities** (for comparing models at inference): IFBench, HLE, MathArena, AiderBench and LiveCodeBench, MCP-Universe +- **Long horizon tasks** (for real-world performance): GAIA2, DABStep, SciCode, or domain specific evaluations for your use cases +- **Games** (for some extra fun in measuring robustness and adaptability): ARC-AGI3 when it's out, TextQuests, Town of Salem if you're interested in safety, or any other game you like which goes beyond Poker/Chess/Go. + +The field is moving toward evaluations that test capability orchestration rather than isolated skills for actual use. This matches our goal of building models that "work well"—systems that can reliably combine core capabilities, tool use, with a good orchestration to solve actual problems. + + +I hope the field moves towards putting more emphasis on functional testing rather than model judges, and generally understandable datasets and tasks. 
+ + \ No newline at end of file diff --git a/app/src/content/chapters/general-knowledge/model-inference-and-evaluation.mdx b/app/src/content/chapters/general-knowledge/model-inference-and-evaluation.mdx new file mode 100644 index 0000000000000000000000000000000000000000..094ea4e1fd372172684c8da46fa28089d6828179 --- /dev/null +++ b/app/src/content/chapters/general-knowledge/model-inference-and-evaluation.mdx @@ -0,0 +1,205 @@ +--- +title: "Model inference and evaluation" +--- + +import llmTk1 from '../../assets/image/llm_tk_1.png'; +import llmLogprob from '../../assets/image/llm_logprob.png'; +import llmGen from '../../assets/image/llm_gen.png'; +import chatTemplatesTokenisation from '../../assets/image/chat-templates-and-tokenisation.png'; +import Image from '../../../components/Image.astro'; +import Note from "../../../components/Note.astro"; +import Sidenote from "../../../components/Sidenote.astro"; +import Accordion from "../../../components/Accordion.astro"; +import HtmlEmbed from "../../../components/HtmlEmbed.astro"; +import Wide from "../../../components/Wide.astro"; +import Reference from "../../../components/Reference.astro"; + +In this section, we'll look at two steps for models: how input is preprocessed to be given to the model (`tokenization`), and how the model generates a prediction from it (`inference`). + + If you want to learn more about how to actually train a model, you should go read the [Smol Training Guidebook!](https://huggingface.co/spaces/HuggingFaceTB/smol-training-playbook) + + + +### Tokenization +The input text (called a *prompt* at inference) is first split into *tokens*, small units of texts (which can be one or several characters, up to the word level) each associated with a number. The whole range of tokens a model can parse is called its *vocabulary*. + +#### Basics of tokenization: Why and how do we tokenize text? +Since large language models are actually big mathematical functions, they eat numbers, not text. 
+ +Say you want to transform a sentence to numbers. You first need to decide how to cut your sentence into small pieces, then map every small piece to a number; this is *tokenization*. + +In the past, people would try to map each character of a text with its index in a alphabet (`a` -> 1, `b` -> 2, etc) which is called *character based tokenization* (you split between characters). On the other end of the spectrum, people also tried to map each word with its index in a dictionary (`a` -> 1, `aardvark` -> 2, `ab` -> 3, etc) which is called *word based tokenization* (you split on spaces, if your language has spaces - if not, it's a bit harder). + +Both these methods share a strong limitation: they remove information from the input text. They erase semantic connections that you can see from word shape (ex: `dis similar`, `similar`, `similar ity`, `similar ly`), information we would like our model to retain, so it connects related words together. +(Plus, what happens if you suddenly have a completely new word in input? It gets no number, and your model can't process it 😔 ) + +Some people therefore had the idea to cut words into sub-words, and assign index to these sub-words (`dis`, `similar`, `ity`, `ly`)! + +This was initially done using morpho-syntactic rules (*morpho-syntax* is like the grammar of word creation). Now most people use byte pair encoding (BPE), a smart statistical method to create the sub-words automatically depending on their frequency in a reference text. + +So as a summary: tokenization is a way to map small units of texts (which can be one or several characters, up to the word level) to numbers (similar to an index). When you want to process text, your input text (called a *prompt* at inference) is split into these *tokens* by a tokenizer. The whole range of tokens a model or tokenizer can parse is called its *vocabulary*. 
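To make the three splitting strategies above concrete, here is a small toy sketch. Everything in it is illustrative: the sub-word vocabulary is written by hand, and the greedy longest-match splitting is a simplification chosen for readability (it is not BPE, which learns its merges from frequency statistics on a reference text).

```python
sentence = "similarly dissimilar"

# Character based: one index per distinct character (space included)
char_vocab = {ch: i for i, ch in enumerate(sorted(set(sentence)))}
char_ids = [char_vocab[ch] for ch in sentence]

# Word based: one index per distinct word, splitting on spaces
words = sentence.split()
word_vocab = {w: i for i, w in enumerate(sorted(set(words)))}
word_ids = [word_vocab[w] for w in words]

# Sub-word based: hand-written vocabulary, greedy longest match
subword_vocab = {"dis": 0, "similar": 1, "ity": 2, "ly": 3}

def subword_tokenize(word: str, vocab: dict) -> list:
    """Split a word into the longest known pieces, left to right."""
    tokens, start = [], 0
    while start < len(word):
        for end in range(len(word), start, -1):
            if word[start:end] in vocab:
                tokens.append(word[start:end])
                start = end
                break
        else:
            # no known piece starts here: the toy vocabulary cannot cover it
            raise ValueError(f"cannot tokenize {word[start:]!r}")
    return tokens

for w in words:
    pieces = subword_tokenize(w, subword_vocab)
    print(w, "->", pieces, [subword_vocab[p] for p in pieces])
```

With the sub-word split, both words share the piece `similar` (index 1), which is exactly the connection that character and word level indices lose.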
+ + +- ⭐ [Explanation of different tokenization methods in the 🤗 NLP Course](https://huggingface.co/learn/nlp-course/en/chapter2/4) +- ⭐ [Conceptual guide about tokenization in the 🤗 doc](https://huggingface.co/docs/transformers/en/tokenizer_summary) +- [Course by Jurafsky on tokenization (and other things)](https://web.stanford.edu/~jurafsky/slp3/2.pdf) - skip to 2.5 and 2.6 + + + +I would strongly recommend reading a longer explanation on how BPE works, as it's really a base of modern LLMs. + +- ⭐ [Explanation of BPE in the 🤗 NLP Course](https://huggingface.co/learn/nlp-course/en/chapter6/5) +- [BPE Paper (for text, as the method existed before in other fields)](https://aclanthology.org/P16-1162/) + + +Building a tokenizer requires making more choices than one would expect. For example, to tokenize numbers, you don't want to use a basic BPE, but do you only index 0 to 9, and assume all other numbers will be compositions of digits? Do you want to store numbers up to, say, one billion, individually? + +Current well known models display a range of approaches to this, but it's unclear what works better to allow mathematical reasoning. This will affect some mathematical evaluations (and is the reason why almost no evaluation is pure arithmetic). + + + +- ⭐ [A nice visual demo by Yennie Jun of how tokenizers of Anthropic, Meta, OpenAI, and Mistral models split numbers](https://www.artfish.ai/p/how-would-you-tokenize-or-break-down) +- [Small history by Beren Millidge of the evolution of number tokenization through the years](https://www.beren.io/2024-05-11-Integer-tokenization-is-now-much-less-insane/) + + +#### How tokenization can mess up your evaluation +**Managing fine-tuned models, system prompts and chat templates** + +Pre-2022, models used to simply be pretrained: text in, text out, nothing else. Then, we got instruction tuning and chat models in 2023, and in 2025 reasoning models. This means that we went from using raw text to using more and more formatting.
+ + + + + +This means a number of models are going to perform terribly if you do not make sure to: +1. respect the format the model expects +2. add a system prompt at the very beginning of inference if your model requires one +3. remove the thinking trace from reasoning models' answers before processing them (you can usually use a regex to remove what's between the model's thinking tags) + + + + + +Different tokenizers behave differently with spacing and special tokens. See this [visualization](https://x.com/danielhanchen/status/1796952220619157694) showing how spacing, tokenization, and templates interact. Never assume tokenizers behave identically! + + +**Paying attention to start and end of sentence tokens** + +Some pretrained models, like the `Gemma` ones, are extremely sensitive to the [inclusion of start of sentence tokens](https://github.com/EleutherAI/lm-evaluation-harness/pull/1465) at inference. You might need to do a couple of experiments to see if that happens for you, and add these tokens manually when evaluating if they are not in your dataset. + +You can also encounter some issues where your model won't stop on an end of sentence token like you would expect. Code models usually have been trained with `\n\t` as a single token. This means that when generating text, they will often generate `\n\t` in one step. A task which defines `\n` as an end of sentence token (= to stop the generation) will let the model continue generating after a `\n\t`, if predicted as one token, since it's not the same as `\n`. But you would actually still want the model to stop. In these cases, you either need to update your end of sentence tokens, or define a mechanism to backtrack on the character representation of the latest tokens to stop (and cut) the generation a posteriori. + +**Multilinguality and tokenization** + +When looking at multilingual evaluations, you'll encounter two issues.
+ +First, as some languages do not always use spacing as a word separator (Korean, Thai, Japanese, Chinese, to cite a few), they will require language specific tokenizers to be split properly, else it will affect their scores on metrics such as [BLEU](https://github.com/EleutherAI/lm-evaluation-harness/issues/212), F1 scores, etc. + +Then, tokenizers in general might be unfair to non-English languages. When training a BPE tokenizer, you use data from the different languages you want to cover, but most of the time, though, this data is unbalanced between languages (with, for example, an order of magnitude more English than Thai, or Burmese). Since BPE tokenizers create their vocabulary tokens based on the most frequent words seen, most of the long tokens will be English words - and most of the words from the less frequent languages will only be split at the character level. This effect leads to an unfairness in multilingual tokenization: some (less frequent, or *lower-resourced*) languages require orders of magnitude more tokens to generate a sentence of equivalent length as English. + + + + + + + +If you are in this case, the number of tokens that the model is allowed to generate for an evaluation should also be language dependent, as not all languages are tokenized in similar amount of tokens. + + + +- ⭐ [A beautiful breakdown and demo by Yennie Jun on tokenization issues across languages](https://www.artfish.ai/p/all-languages-are-not-created-tokenized): The breakdown in itself is very clear, and the embedded space comes from her work. +- ⭐ [A demo by Aleksandar Petrov on unfairness of tokenization](https://aleksandarpetrov.github.io/tokenization-fairness/): I recommend looking at `Compare tokenization of sentences` to get a feel for the differences in cost of inference depending on languages + + + +### Inference + +Now that we know how to convert our input text into something the LLMs can parse, let's look at how models process this text. 
+ +From this input text, the LLM generates a probability distribution over the whole vocabulary for the next token. To get a continued generation, we can take the most probable token (give or take some added randomness to get more interesting outputs) as the next one, then repeat the operation, using the new token as the end of the prompt, etc. + + + + + + +**Log-likelihood evaluations**: Given a prompt and one (or several) answers, what is the probability of said answer(s) for my model? + +**Generative evaluations**: Given a prompt, what text does my model generate? + +The choice depends on your task (as we'll see below) and on your model: most models under APIs do not return the log-probabilities, so you'll need to use generative evaluations systematically to evaluate them. + + + +#### Log-likelihood evaluations +For log-likelihood evaluations, we want the conditional probability of one or several choices given a prompt - in other terms, what is the likelihood to get a specific continuation given an input? +So: +- we concatenate each choice with the prompt, and pass them to our LLM, which outputs the logits of each token depending on the previous ones +- we only keep the last logits (associated with the choice tokens), and apply a log softmax to get log-probabilities (where the range is `[-inf, 0]` instead of `[0, 1]`) +- we then sum the log-probabilities of all individual tokens to get the overall choice log-probability +- we can finally apply a normalization based on choice length + + + +This allows us to apply one of the following metrics: +- get the preferred answer of a model among several choices, like in the above picture. (*However, this can advantage scores of models which would have, freely, generated something else, like `Zygote` in the picture.*) +- test if a single choice has a probability above 0.5 +- study model calibration. A well calibrated model is a model whose answer probabilities reflect how often it is actually right, so that correct answers tend to get the highest probabilities.
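
The scoring steps listed above can be sketched in a few lines of plain Python. This is a minimal sketch, not the code of any evaluation library: it assumes you already have, for each choice, the rows of raw logits that the model produced at the positions of the choice tokens (the names `log_softmax`, `choice_logprob` and `pick_choice` are made up for this example).

```python
import math


def log_softmax(logits):
    """Turn one row of raw logits into log-probabilities (numerically stable)."""
    top = max(logits)
    log_norm = top + math.log(sum(math.exp(x - top) for x in logits))
    return [x - log_norm for x in logits]


def choice_logprob(choice_logits, choice_token_ids):
    """Sum the log-probabilities of the choice tokens.

    `choice_logits[i]` is the row of logits predicting the token
    `choice_token_ids[i]` (rows belonging to the prompt are already dropped).
    """
    total = 0.0
    for row, token_id in zip(choice_logits, choice_token_ids):
        total += log_softmax(row)[token_id]
    return total


def pick_choice(logprobs, lengths=None):
    """Return the index of the preferred choice, optionally length-normalised."""
    if lengths is not None:
        logprobs = [lp / n for lp, n in zip(logprobs, lengths)]
    return max(range(len(logprobs)), key=lambda i: logprobs[i])


# Hypothetical summed log-probabilities for two choices of 1 and 4 tokens.
scores = [-1.0, -3.0]
print(pick_choice(scores))                  # 0: the short answer wins on raw sums
print(pick_choice(scores, lengths=[1, 4]))  # 1: per-token score is -0.75 vs -1.0
```

The last two lines show why the normalization step matters: a longer choice accumulates more negative terms, so the raw sum alone tends to favour short answers.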
+ +To learn more about calibration, you can check [this paper](https://arxiv.org/abs/2207.05221) from Anthropic, on what it is, how to detect it, and how to train models to be well calibrated, and [this paper](https://arxiv.org/abs/2311.14648) on some possible limits of calibration. + + + + +A multiple choice question answer can be expressed as a free form generative evaluation too! For this reason, you'll sometimes see a mention of the task **formulation**. + +There are three common task formulations: +- **Multiple choice format (MCF)**: we compare the likelihood of choice indices, where choices are explicitly presented in the prompt and prefixed with A/B/C/D (as in MMLU) +- **Cloze formulation (CF)**: we compare the likelihood of different choices without providing them in the prompt +- **Freeform generation (FG)**: we evaluate the accuracy of greedy generation for a given prompt + +FG requires substantial latent knowledge and is usually too difficult for models during short pre-training ablations. For this reason, we typically focus on multiple choice formulations (MCF or CF) when running small-scale ablations. However, for post-trained models, FG becomes the primary formulation since we're evaluating whether the model can actually generate useful responses. +However, research has also shown that models struggle with MCF early in training, only learning this skill after extensive training, making CF better for early signal. We thus recommend using CF for small ablations, and integrating MCF in the main run, as it gives better mid-training signal once a model has passed a threshold to get a sufficiently high signal-to-noise ratio for MCF. +A quick note also that, to score a model's answer in sequence likelihood evaluations like CF, we compute accuracy as the percentage of questions where the correct answer has the highest log probability normalised by character/token count. This normalisation prevents a bias toward shorter answers.
+ + +The point at which MMLU MCF becomes non-random depends on the model size and training data. For a 7B transformer, the OLMES paper found the model starts showing non-random performance after 500B tokens. For a 1.7B model, we found this happens after 6T tokens in SmolLM2. + + + + + +For multiple choice (MCQA) evaluations, you generally want to tokenize the context together with the choices, as it creates a succession of tokens which is likely/natural for the model. + +However, some tokenizers (like the [Llama one](https://github.com/EleutherAI/lm-evaluation-harness/pull/531#issuecomment-1595586257)) do not satisfy `tok(context + choice) = tok(context) + tok(choice)` (and add or remove spacing). This means that comparing the log-probabilities of the choices only is not trivial, as the context tokens can "bleed out" into them, messing up the comparison. + +To give a concrete example, say you have characters `C1`, `C2`, and `C3` as base tokens of your vocabulary, and `C1C2` also happens to be a single token learned during BPE. + +Say your context is `C1`, and the choices are `C2` and `C3`. +If you tokenize the context with the choices, you compare `C1C2` (one token) with `C1+C3` (two tokens). Even if you normalize the logprobs by length, you are not comparing the same thing. +Comparing after tokenizing the context and choices separately means you compare `C1+C2` and `C1+C3`. But since `C1C2` is a token, the occurrence of `C1+C2` is likely rare in the data your model saw, so it is an unlikely succession for your model, which can mess up your log-probabilities. + +If this is the case for your model, the solution is usually to go for the least bad option, comparing the comparable: compute the tokens of context and choice separately and then concatenate them after removing the special start/end of sentence tokens which might have been added. + + +#### Generative evaluations +For a generative evaluation, we want the text generated by the model given an input prompt.
+ +It is obtained in an auto-regressive way: we pass the prompt to the model, look at the most likely next token, select it as the first token of the model's answer, then repeat until we reach an end of generation condition (maximum length, special token to stop the generation, etc). All the tokens generated by the model are considered its answer to the prompt. + + + +We can then compare this generation with references and score the distance between both (using either simple metrics like exact match, more complex metrics like BLEU, or models as judges). + + +- ⭐ [Blog on several ways to evaluate MMLU](https://huggingface.co/blog/open-llm-leaderboard-mmlu) , by my team at Hugging Face. I recommend reading it if you want to delve deeper into the differences between multi choice log-likelihood evaluations and generative ones, including what it can mean with respect to score changes (The above illustrations come from the blog and have been made by Thom Wolf) +- ⭐ [A beautiful mathematical formalization of the above inference methods](https://arxiv.org/abs/2405.14782v2), from EleutherAI. Go to the Appendix directly. + + + diff --git a/app/src/content/chapters/general-knowledge/picking-your-evaluation.mdx b/app/src/content/chapters/general-knowledge/picking-your-evaluation.mdx new file mode 100644 index 0000000000000000000000000000000000000000..f53271858a052702589fb4324771a3219098310c --- /dev/null +++ b/app/src/content/chapters/general-knowledge/picking-your-evaluation.mdx @@ -0,0 +1,227 @@ +--- +title: "Picking good automatic evaluations for pretraining" +--- + +import Note from "../../../components/Note.astro"; +import Sidenote from "../../../components/Sidenote.astro"; +import HtmlEmbed from "../../../components/HtmlEmbed.astro"; + +In some cases, you don't want to "just" reproduce existing scores a posteriori, but you actually need to understand how well your model is training while it's happening.
The evaluations you need then have different properties than evaluations for the final performance of models, as you need tasks which will provide good signal even when the model is not yet very good. + +So the FineWeb team designed a method to select the best evaluations for pre-training ablations, across 9 languages - let's listen to their wise advice. + +For these languages, we collected and implemented all available tasks that we could find, a total of **185 tasks**. Then, we began task selection with two primary goals: ensuring **evaluation diversity**, and making sure each task provided a **reliable signal** during pre-training. + +For evaluation diversity, we aimed to assess a broad range of model capabilities, including: + +- **Reading comprehension (RC)**: Understanding provided context and answering questions based on it. +- **General knowledge (GK)**: Answering questions about facts from various fields without added context. +- **Natural Language Understanding (NLU)**: Comprehending the semantics of provided input. +- **Common-sense reasoning (RES)**: Demonstrating the ability to perform simple reasoning requiring embodied knowledge. +- **Generative tasks**: Ability to generate text in the target language without the "help" of multiple choice options. + +We consider that tasks provide a reliable signal if they provide a dependable score. This means the score should be above the random baseline, increase as training progresses, show low variability across different seeds, and provide consistent model ranking at each training step (for similar sized models trained with the same hyperparameters on the same amount of data). + +To thoroughly examine the signal our tasks provide, we trained many 1.5B parameter models for each language, using 30B tokens from subsets of the supported languages of the five largest openly available multilingual web datasets. These models were trained with the same hyperparameters and tokenizer.
We then evaluated them at regular checkpoint intervals on the collected tasks (with no instruction and no system prompt in a 0-shot setting). + +This process required multiple evaluation runs for each task due to iterations on its implementation, resulting in a total of **73 000 GPU hours consumed** 🔥! + +With **49 models trained** we could finally define what a **reliable signal** means to us! + +#### Monotonicity + +One of our core requirements for a task is that it can be learned from training data and this **learning can be gradually observed as the training progresses**. Without this improvement through time, it's uncertain whether there will ever be an improvement in the future. + +To measure this, we used the **Spearman rank correlation** to quantify the correlation between steps and score. Spearman rank correlation can capture monotonicity even when scores don't evolve linearly with the number of steps. We required each task to have at least an average correlation of 0.5 over all model training runs. + + + + +#### Low noise + +When comparing model performance on tasks, we need to consider whether differences are due to **evaluation noise or genuine performance variations**. + +Noise can arise from the stochastic processes involved in model training, such as random token sampling, data shuffling, or model initialization ([Madaan et al., 2024](https://arxiv.org/abs/2406.10229)). To measure how sensitive each task is to this noise, we trained four additional models on our own monolingual corpora (unfiltered CommonCrawl data in each language) using different seeds. + +For each task, we computed: + +1. First, a standard deviation of model scores for every step (approximately every 1B tokens), which we call the **per-step-std**. +2. Then, to obtain a global variability measurement, we averaged all the per-step-std values to get the **avg-std** over the full training. 
We assume this value is an upper-bound across model architectures and training datasets (as it was approximated by models trained on a "dirtier" dataset, therefore with higher variability). +3. Finally, we computed the **signal-to-noise ratio** (SNR) as the main metric for task variability. We calculate SNR as the mean score at 30B tokens of all runs divided by the avg-std. This metric measures how significant the overall score is relative to the score variations (noise). + +We aimed for each task to have an SNR > 20. The only exception to this rule are generative tasks, which typically have relatively low SNR, but are still worth including as they provide insights into how the model behaves when prompted to generate unconstrained (without answer options). In a multilingual setting, this is particularly relevant as some models trained on multiple languages can exhibit high task scores but then suddenly reply in the wrong language for generative tasks! + + + + +Assuming model performance is normally distributed across different seeds, we want the benchmark-run performance to be at least 3 final-stds above the benchmark random baseline. This would mean that 99.85% of seed scores are above the random baseline (formally, benchmark-run performance - benchmark random baseline > 3 * final-std). + + +#### Non-Random Performance + +Many model capabilities are acquired later in training, thus **many tasks** (especially harder ones, such as math-related ones) **show baseline-level performance for an extended period**. While these tasks are useful, they're not ideal for early pre-training evaluation, and **we did not want to keep them** for this setting. + +We first computed the baseline random performance of the task (as the sum of 1/n_choices for all samples for multiple choice questions, and as zero for generative evaluations). Then we calculated the task's distance from the baseline as the maximum score across all models minus the baseline. 
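
The three quantities described so far (monotonicity, signal-to-noise ratio, distance from the random baseline) can be sketched with the standard library only. This is a rough illustration, not the FineWeb team's code: the function names are made up, the standard deviation is assumed to be the sample standard deviation, and the random baseline is assumed to be averaged over samples so that it lives on the same scale as accuracy.

```python
import math
from statistics import mean, stdev


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation (undefined if either input is constant)."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


def signal_to_noise(runs):
    """`runs`: one list of scores per seed, aligned on the same training steps."""
    per_step_std = [stdev(step_scores) for step_scores in zip(*runs)]
    avg_std = mean(per_step_std)
    final_mean = mean(run[-1] for run in runs)  # mean score at the last step
    return final_mean / avg_std


def distance_from_baseline(runs, n_choices_per_sample):
    """Best score over all runs minus the (assumed averaged) random baseline."""
    baseline = mean(1 / n for n in n_choices_per_sample)
    return max(max(run) for run in runs) - baseline


# Hypothetical scores of one run at four checkpoints: perfectly monotonic.
print(spearman([1, 2, 3, 4], [0.25, 0.26, 0.30, 0.41]))  # 1.0
```

Note that the Spearman value is 1.0 here even though the scores do not grow linearly with the steps, which is the property the monotonicity criterion relies on.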
+ + + + +#### Model Ordering Consistency + +Let's not forget that the main goal of these evaluations is to compare models and datasets! + +In the future, we want to use these evaluations to select the best datasets for full model pretraining. This means **our tasks should rank datasets trained using very few tokens (we typically run data ablations on 30B tokens), in the same order as they would when trained for longer, after significantly more steps.** + +In other words, we would like tasks to have **predictive capability regarding future performance during pre-training**: if pre-training dataset A outperforms pre-training dataset B at 30 billion tokens, we would like this trend to continue at 300 billion tokens. + +Proving this is inherently impossible, but there is a necessary preliminary condition that we can test for: for the results to be consistent at large scales, they must also first show consistency at smaller scales! + +To measure this consistency in task ordering, we computed the average **Kendall's Tau** of models ranking between every two consecutive steps. We only considered steps starting after 15B tokens of pre-training, as we found orderings before the range incredibly noisy. A high value of this metric indicates that the ordering remains consistent as training progresses. + + +We had no strict minimum value requirement for this property, instead using it to establish comparisons between tasks. + + + + + +#### Metrics + +As the targets in CF of multiple choice tasks are choices themselves, each target can have a different number of tokens, characters, and unconditional probability (probability of generating the choice without a context prefix). + +Measuring accuracy without normalization would have the models prefer answers with fewer tokens, for example. 
+ +To account for this, we consider the following accuracy variations: + +- **Accuracy** : + `acc` = $\underset{i}{\arg\max}(ln(P (a_i|q)))$ +- **Accuracy normalized over character length** : + `acc_char` = $\underset{i}{\arg\max}\frac{ln(P (a_i|q))}{num\_characters(a_i)}$ +- **Accuracy normalized over token length** : + `acc_token` = $\underset{i}{\arg\max}\frac{ln(P (a_i|q))}{num\_tokens(a_i)}$ +- **PMI Accuracy** : + `acc_pmi` = $\underset{i}{\arg\max}ln\frac{P (a_i|q)}{P (a_i|u)}$, where $u =$ ''Answer:'' + +Where $a_i$ is the answer choice $i$, $q$ is a question prompt and $P (a_i|q)$ is the probability of having $a_i$ follow $q$. For more details see [Gu et al., 2024](https://arxiv.org/abs/2406.08446) and [Biderman et al., 2024](https://arxiv.org/abs/2405.14782). + +`acc_pmi` metric measures how much more likely a model is to predict A_i if provided with question context compared to if there was no context at all. This can be useful if the correct choice contains generally unlikely tokens, making the model less likely to choose such an answer. + +For our generative tasks on the other hand, we used the following metrics: + +- `prefix_match`: Exact match where only the prefix of the answer must match +- `f1`: F1 score computed over predicted/gold words extracted using a word tokenizer + +For both generative metrics, minor preprocessing is applied to remove articles and punctuation, and lowercase the text. + +Selecting the best evaluation metrics proved to be a **challenging task**. Not only is there no single metric that consistently outperforms the rest, but we often encountered situations where one metric had better monotonicity while another had a higher signal-to-noise ratio. In such cases, we typically made our decision based on the selected metric for tasks' implementation in a different language. 
We are aware that such hand-picking is often not possible and thus offer the following recommendations: + +➡️ Multichoice Tasks + +- We found **base accuracy** to perform well for tasks with answer options varying subtly (e.g. Yes/No/Also), particularly NLI tasks. In such cases, where the answer options are often each a single token, base accuracy is the advisable choice. +- While the OLMES authors ([Gu et al., 2024](https://arxiv.org/abs/2406.08446)) recommend using PMI for tasks with unusual words, we found **PMI** to be highly effective for "difficult" reasoning and knowledge tasks like AGIEVAL or MMLU. In these cases, PMI provided the best results and was often the only metric delivering performance above random. That said, PMI was, on average, the weakest metric across all other tasks, while also being two times more expensive to compute. We therefore only recommend its use for complex reasoning and knowledge tasks. +- The metrics we found to be **most reliable overall** were length normalization metrics (token or character-based). However, the best choice was dependent on language, rather than being consistent for a given task. Due to that, we recommend using the maximum of acc_char and acc_token for the most reliable results. (Note that acc_token is heavily tokenizer dependent; in our ablations, all models were trained using the same tokenizer.) + +➡️ Generative Tasks + +For **generative metrics**, the choice is clearer: we suggest using the F1 score unless exact matching is required, as in math-related tasks. F1 is generally less noisy and more resilient to small changes in the generations.
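
As an illustration of the two generative metrics, here is a minimal sketch of `prefix_match` and `f1` with the light preprocessing described above. It rests on several assumptions that are not in the original text: words are split on whitespace rather than with a dedicated word tokenizer, the article list is English-only (a multilingual setup would need per-language handling), and `prefix_match` is read as "the normalised generation starts with the normalised gold answer".

```python
import re
import string
from collections import Counter

# English-only articles: an assumption made for this sketch.
ARTICLES = re.compile(r"\b(a|an|the)\b")


def normalize(text):
    """Lowercase, drop ASCII punctuation and articles, squeeze whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = ARTICLES.sub(" ", text)
    return " ".join(text.split())


def prefix_match(prediction, gold):
    """True if the normalised prediction starts with the normalised gold answer."""
    return normalize(prediction).startswith(normalize(gold))


def f1(prediction, gold):
    """Word-level F1 between the normalised prediction and gold answer."""
    pred_words = normalize(prediction).split()
    gold_words = normalize(gold).split()
    # Multiset intersection: each shared word counts at most as often as it appears in both.
    overlap = sum((Counter(pred_words) & Counter(gold_words)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_words)
    recall = overlap / len(gold_words)
    return 2 * precision * recall / (precision + recall)


print(prefix_match("Paris, the capital of France.", "Paris"))  # True
print(f1("dog", "cat"))                                        # 0.0
```

A generation that keeps going after the right answer still passes `prefix_match`, while `f1` gives partial credit for overlapping words, which is one way to see why it reacts less sharply to small changes in the generations.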
diff --git a/app/src/content/chapters/human-evaluation/using-human-annotators.mdx b/app/src/content/chapters/human-evaluation/using-human-annotators.mdx new file mode 100644 index 0000000000000000000000000000000000000000..5121b5b018e08141dde5c85a158f06250cb8940a --- /dev/null +++ b/app/src/content/chapters/human-evaluation/using-human-annotators.mdx @@ -0,0 +1,81 @@ +--- +title: "Using human annotators" +--- + +import bestAnnotationPractices from '../../assets/image/best_annotation_practices.png'; +import Image from '../../../components/Image.astro'; +import HtmlEmbed from "../../../components/HtmlEmbed.astro"; +import Note from "../../../components/Note.astro"; +import Sidenote from "../../../components/Sidenote.astro"; +import Accordion from "../../../components/Accordion.astro"; + +#### Using human annotators + +I suggest reading Section 3 of this [review](https://aclanthology.org/2024.cl-3.1/) of good practices in data annotation quality. If you want production level quality and have the means to implement all of these methods, go ahead! + + + +However, important guidelines (no matter your project size) are the following, once you defined your task and scoring guidelines. + +- **Workforce selection, and if you can monetary incentive** +You likely want the people working on your task to: +1) obey some demographics. + Some examples: be native speakers of the target language, have a higher education level, be experts in a specific domain, be diverse in their geographical origins, etc. + Your needs will vary depending on your task. +1) produce high quality work. + It's notably important now to add a way to check if answers are LLM-generated, and you'll need to filter some annotators out of your pool. + *Imo, unless you're counting on highly motivated crowdsourced annotators, it's always better to pay your annotators correctly.* + + + +Unless you have highly motivated crowdsourced annotators, always pay fairly. 
Underpaid annotators produce lower quality work, introduce more errors, and may use LLMs to complete tasks quickly. + + +- **Guideline design** +Make sure to spend a lot of time really brainstorming your guidelines! That's one of the points on which we spent the most time for the [GAIA](https://huggingface.co/gaia-benchmark) dataset. + + + +When creating the [GAIA benchmark](https://huggingface.co/gaia-benchmark), guideline design consumed more time than any other phase. Clear, unambiguous guidelines are worth the investment: they prevent costly re-annotation rounds. + + +- **Iterative annotation** +Be ready to try several rounds of annotations, as your annotators will misunderstand your guidelines (they are more ambiguous than you think)! Generating samples several times will allow your annotators to really converge on what you need. + + - **Quality estimation** and **Manual curation** +You want to control answers (notably via inter-annotator agreement if you can get it) and do a final selection to keep only the highest quality/most relevant answers. + +Specialized tools to build annotated high quality datasets like [Argilla](https://argilla.io/) can also help you. + + +- ⭐ [How to set up your own annotator platform in a couple minutes](https://huggingface.co/learn/cookbook/enterprise_cookbook_argilla), by Moritz Laurer. A good read to get some hands-on experience using open source tools (like Argilla and Hugging Face), and to better understand the dos and don'ts of human annotation at scale. +- ⭐ [A guide on annotation good practices](https://aclanthology.org/2024.cl-3.1/). It's a review of all papers about human annotation dating from 2023, and it is very complete. Slightly dense, but very understandable. +- [Another guide on annotation good practices](https://scale.com/guides/data-labeling-annotation-guide), by ScaleAI, specialised in human evaluations. It's a more lightweight complement to the above document.
+- [Assumptions and Challenges of Capturing Human Labels](https://aclanthology.org/2024.naacl-long.126/) is a paper on how to look at sources of annotator disagreement and mitigate them in practice + + + +Here are a few practical tips you might want to consider when using human annotators to build an evaluation dataset. + + +**Designing the task** + +- **Simple is better**: Annotation tasks can get unnecessarily complex, so keep it as simple as possible. Keeping the cognitive load of the annotators to a minimum will help you ensure that they stay focused and make annotations of a higher quality. +- **Check what you show**: Only show the necessary information for annotators to complete the task and make sure you don't include anything that could introduce extra bias. +- **Consider your annotators' time**: Where and how things are displayed can introduce extra work or cognitive load and therefore negatively impact the quality of results. For example, make sure that the texts and the task are visible together and avoid unnecessary scrolling. If you combine tasks and the result of one informs the other, you can display them sequentially. Think about how everything is displayed in your annotation tool and see if there's any way you can simplify even more. +- **Test the setup**: Once you have your task designed and some guidelines in place, make sure you test it yourself on a few samples before involving the whole team, and iterate as needed. + +**During the annotation** + +- **Annotators should work independently**: It's better if annotators don't help each other or see each other's work during the task, as they can propagate their own biases and cause annotation drift. Alignment should always happen through comprehensive guidelines. You may want to train any new team members first on a separate dataset and/or use inter-annotator agreement metrics to make sure the team is aligned.
+- **Consistency is key**: If you make important changes to your guidelines (e.g., changed a definition or instruction, or have added/removed labels), consider if you need to iterate over the annotated data. At least, you should track the changes in your dataset through a metadata value like `guidelines-v1`. + +**Hybrid human-machine annotation** + +Sometimes teams face constraints on time and resources but don't want to sacrifice the benefits of human evaluation. In these cases, you may use the help of models to make the task more efficient. + +- **Model-aided annotation**: You may use the predictions or generations of a model as pre-annotations, so that the annotation team doesn't need to start from scratch. Just note that this could introduce the model's biases into human annotations, and that if the model's accuracy is poor it may increase work for annotators. +- **Supervise model as a judge**: You can combine the power of the model as a judge methodology (see the section on "Model as a judge") and human supervisors who validate or discard the results. Note that the biases discussed in the "Pros and cons of human evaluation" will apply here. +- **Identify edge cases**: For an even faster task, use a jury of models and then have your human supervisor(s) step in where models disagree or there's a tie to break. Again, be aware of the biases discussed in the "Pros and cons of human evaluation". + + \ No newline at end of file diff --git a/app/src/content/chapters/intro.mdx b/app/src/content/chapters/intro.mdx new file mode 100644 index 0000000000000000000000000000000000000000..11734be36d41e676b7dfba9f134b757fcc7ee036 --- /dev/null +++ b/app/src/content/chapters/intro.mdx @@ -0,0 +1,73 @@ +--- +title: "Intro" +--- + +import HtmlEmbed from "../../components/HtmlEmbed.astro"; +import Note from "../../components/Note.astro"; +import Sidenote from "../../components/Sidenote.astro"; +import Quote from "../../components/Quote.astro"; + +## What is model evaluation about?
+ +As you navigate the world of LLMs, whether you're training or fine-tuning your own models, selecting one for your application, or trying to understand the state of the field, there is one question you have likely stumbled upon: + + +How can one know if a model is *good*? + + +The answer is (surprisingly given the blog topic) evaluation! It's everywhere: leaderboards ranking models, benchmarks claiming to measure *reasoning*, *knowledge*, *coding abilities* or *math performance*, papers announcing new state-of-the-art results... + +But what is evaluation, really? And what can it really tell you? + +This guide is here to help you understand it all: what evaluation can and cannot do, when to trust different approaches (what their limitations and biases are too!), how to select benchmarks when evaluating a model (and which ones are relevant in 2025), and how to design your own evaluation, if you so want. + +Throughout the guide, we'll also highlight common pitfalls, tips and tricks from the Open Evals team, and hopefully help you learn how to think critically about the claims made from evaluation results. + +In this guide, we focus on evaluations for language (mostly natural language), but many principles also apply to other modalities. + +Before we dive into the details, let's quickly look at why people do evaluation, as who you are and what you are working on will determine which evaluations you need to use. + +### The model builder perspective: Am I building a strong model? + +If you are a researcher or engineer creating a new model, your goal is likely to build a strong model that performs well on a set of tasks. For a base model (training from scratch), you want the model to do well on general tasks, measuring a variety of different capabilities. If you are post-training a base model for a specific use case, you probably care more about the performance on that specific task. The way you measure performance, in either case, is through evaluations. 
+ +As you experiment with different architectures, data mixtures, and training recipes, you want to make sure that your changes (choosing different training data, architecture, parameters, etc) have not "broken" the expected performance for a model of these properties, and possibly even improved it. The way you test for the impact of different design choices is through **ablations**: an ablation is an experiment where you typically train a model under a specific setup, evaluate it on your chosen set of tasks, and compare the results to a baseline model. +Therefore, the choice of evaluation tasks is critical for ablations, as they determine what you will be optimizing for as you create your model. + + + +For base models, one would typically resort to selecting standard benchmark tasks used by other model builders (think the classic list of benchmarks that are always reported when a new model is released - we'll have a look at those below). For a specific use case, you can either use existing evaluation tasks if they are available -- and you likely will want to take a good look if they are not "standard" -- or design your own (discussed below). As you will likely run a lot of ablations, you want the evaluation tasks to provide strong enough signal (and not just meaningless noisy results) and you want them to run cheaply and quickly, so that you can iterate fast. +Through ablations, we are also able to predict the performance of bigger models based on the performance of smaller ones, using scaling laws. + +Besides ablations for experiments, you will likely also want to run evaluations on intermediate checkpoints as your model is training, to ensure it is properly learning and improving at the different tasks, and does not start regressing due to spikes or other issues. Finally, you want to evaluate the final checkpoint so that you can announce that your model is SOTA when you release it. 
+ + +Despite often grandiose claims, for any complex capability, we cannot at the moment just say "this model is the best at this", but should instead say **"this model is the best on these samples for this specific task that we hope are a good proxy for this capability, without any guarantee"**. + + +(You can still claim you are SOTA, just keep the caveat in mind.) + +### The model user perspective: Which model is the best on \? + +You want to use a model someone else trained for your specific use case, without performing additional training, or maybe you will perform additional training and are looking for the best existing model to use as a base. + +For common topics like math, code, or knowledge, there are likely several leaderboards comparing and ranking models using different datasets, and you usually just have to test the top contenders to find the best model for you (if they are not working for you, it's unlikely the next best models will work). + +You could want to run the evaluation and comparisons yourself (by reusing existing benchmarks) to get more details to analyse on the model successes and failures, which we will cover below. + + +In [their paper](https://arxiv.org/pdf/2404.02112) about lessons learned on benchmarking and dataset design from the ImageNet era, the authors argue that, since scores are susceptible to instability, the only robust way to evaluate models is through rankings, and more specifically by finding broad groups of evaluations which provide consistent and stable rankings. I believe looking for ranking stability is indeed an extremely interesting approach to model benchmarking, as we have shown that LLMs *scores* on automated benchmarks are extremely susceptible to [minute changes in prompting](https://huggingface.co/blog/evaluation-structured-outputs), and that human evaluations are not more consistent - where *rankings* are actually more stable when using robust evaluation methods. 
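To make the idea of ranking stability concrete, here is a minimal sketch that compares how a set of models rank under two prompt variants, using a hand-written Kendall's tau. The model names and scores are made-up illustrative numbers, not measurements, and `ranking` and `kendall_tau` are hypothetical helpers written for this example (ties are simply ignored).

```python
from itertools import combinations


def ranking(scores):
    """Return model names sorted from best to worst score."""
    return sorted(scores, key=scores.get, reverse=True)


def kendall_tau(scores_a, scores_b):
    """Kendall's tau between two score dicts covering the same models.

    Counts model pairs ordered the same way (concordant) or the opposite
    way (discordant) under both evaluations. Tied pairs count as neither.
    """
    models = list(scores_a)
    concordant = discordant = 0
    for m1, m2 in combinations(models, 2):
        diff_a = scores_a[m1] - scores_a[m2]
        diff_b = scores_b[m1] - scores_b[m2]
        if diff_a * diff_b > 0:
            concordant += 1
        elif diff_a * diff_b < 0:
            discordant += 1
    n_pairs = len(models) * (len(models) - 1) // 2
    return (concordant - discordant) / n_pairs


# Illustrative (made-up) accuracies for the same models under two prompt formats.
prompt_v1 = {"model_a": 0.62, "model_b": 0.58, "model_c": 0.41}
prompt_v2 = {"model_a": 0.55, "model_b": 0.52, "model_c": 0.37}

print(ranking(prompt_v1), ranking(prompt_v2))
print(kendall_tau(prompt_v1, prompt_v2))
```

In this toy case every score moves by several points between the two prompts while the ordering of the models stays the same, which is the situation where a ranking is more informative than the raw numbers.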
+ + +Similarly to model builders hillclimbing a specific capability, for less common topics, you might need to think about designing your own evaluations, which is detailed in our last section. + + +- Model builder: You need fast, high-signal benchmarks that cover the domains/capabilities you care about and can be run repeatedly during ablations. +- Model user: You need benchmarks that match your specific use case, even if that means creating custom ones. + + + +We are strongly missing any kind of good definitions and framework on what intelligence is for machine learning models, and how to evaluate it (though some people have tried, for example [Chollet](https://arxiv.org/abs/1911.01547) in 2019 and [Hendrycks et al](https://www.agidefinition.ai/paper.pdf) this year). Difficulty in defining intelligence is not a problem specific to machine learning! In human and animal studies, it is also quite hard to define, and metrics which try to provide precise scores (IQ and EQ for example) are hotly debated and controversial, with reason. + +There are, however, some issues with focusing on intelligence as a target. 1) Intelligence tends to end up being a moving target, as any time we reach a capability which was thought to be human specific, we redefine the term. 2) Our current frameworks are made with the human (or animal) in mind, and will most likely not transfer well to models, as the underlying behaviors and assumptions are not the same. 3) It is kind of a useless target too - we should target making models good at specific, well defined, purposeful and useful tasks (think accounting, reporting, etc) instead of aiming for AGI for the sake of it. 
+ \ No newline at end of file diff --git a/app/src/content/chapters/troubleshooting/troubleshooting-reproducibility.mdx b/app/src/content/chapters/troubleshooting/troubleshooting-reproducibility.mdx new file mode 100644 index 0000000000000000000000000000000000000000..e7644357cda2b66242da764f1f47899155a00c04 --- /dev/null +++ b/app/src/content/chapters/troubleshooting/troubleshooting-reproducibility.mdx @@ -0,0 +1,93 @@ +--- +title: "Troubleshooting reproducibility" +--- + +import Note from "../../../components/Note.astro"; +import Sidenote from "../../../components/Sidenote.astro"; +import HtmlEmbed from "../../../components/HtmlEmbed.astro"; + +Let's say you have read a recent tech report about a cool new model, and you want to reproduce their results on your machine... but you're not managing to? +Let's explore why. + +#### Different code base +To reproduce evaluation scores to the decimal point, you first need to make sure you're using exactly the same code base as the paper you want to reproduce. + +Usually, this means either using the evaluation default code as provided by the authors, or a standard implementation in a reference library like EleutherAI's `lm_eval` or HuggingFace's `lighteval`. However, if the source code for the evaluation is not provided, then, sorry, it's unlikely that you'll be able to reproduce the results precisely. + +If you want to easily understand what kind of discrepancies happen when using different implementations, you can explore [this blog](https://huggingface.co/blog/open-llm-leaderboard-mmlu) (⭐) we wrote with the eval team at HuggingFace. It studies the differences we observed between 3 common implementations of the MMLU evaluation (in `lm_eval`, `helm`, and in the original author implementation), and how they change model scores. 
+ +#### Subtle implementation or loading difference +We've observed that the following were easy things to mess up, even when using the same code base: +- **Different random seeds.** + - Normally, inference is less affected by random seeds than training. However, they can still affect some CUDA operations (see the PyTorch page on [reproducibility](https://pytorch.org/docs/stable/notes/randomness.html)) and change predictions if you're using a non greedy generation strategy. They can also affect the prompt if you're using few-shots, and some pre or post-processing functions. + -> A tiny change can result in a couple of points of difference. +- **Actually different metrics**. + Metrics can be different in practice even if they share the same name. Some examples: + - If the original implementation is a *log likelihood* `exact match` (computing the log probabilities of different possible answers), and you're using a *generative* `exact match` (only comparing the main greedy generation with the reference), you won't get the same scores. + - We also saw, in evaluation code bases, a number of tasks which were defined as `exact match`, but were actually `prefix exact match` (comparing only the beginning of the generation with the reference), or `suffix exact match` (the opposite), or `quasi exact match` (exact match with a normalization). + -> You therefore can't rely only on the metric name to determine what is happening, and need to look at the code. +- **Different normalization**. + - To go back to our above `exact match` comparison example, in `lm_eval` v1, a number of tasks were simply named generative `exact match`: you would assume from this that the prediction is *compared as such* to a reference. + Looking at the code, the prediction would instead go through a normalization step (removing punctuation, homogenizing numbers, etc) before being compared to the reference. This will obviously change results quite a lot. 
+ (The `lm_eval` v2 now includes the normalization name in most metric names.) + -> This is one of the easiest things to mess up, especially for tasks which require a lot of normalization/answer post processing, like math evaluations (where you want to extract the answer from a generated explanation). + + + +**Four factors that change results even with identical code:** + +- **Hardware**: PyTorch doesn't guarantee reproducibility across different GPUs/hardware +- **Inference library**: transformers, vllm and sglang handle batching and matrix operations slightly differently as of 2025 +- **Batch size**: Different batch sizes = different results (you should fix the batch size for reproducibility, though be careful about OOM errors) +- **Loading precision**: Lower precision (especially quantized models vs floating point models) will change numerical results + + + +#### Different prompt +3 main things can come into play for prompt variation. + +**Prompt itself** + +The format you are using for the prompt can and will change scores wildly. + +For example, for multichoice question answers, common formats include very simple variations (e.g. using `A` vs `A.` vs `A)` to introduce choices), which, while **semantically equivalent** (as they contain the exact same content) can still result in a difference of *several points for the same model*. + + + +We did some experiments on this (you'll see up to a 7-point difference for the same model on the semantically equivalent prompts, the 5 rightmost columns), and a [paper observed similar results](https://arxiv.org/abs/2310.11324). + +**Other example**: Llama 3.1 models predicted correct MATH-Hard answers but scored poorly on the Open LLM Leaderboard, because they overfit to GSM8K's prompt format and couldn't adapt to the new one for this eval, despite it being provided in few shot examples. 
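To see how small these format differences are, here is a minimal sketch that renders the same multiple-choice question with the `A` / `A.` / `A)` choice styles mentioned above. The `CHOICE_FORMATS` table, the `render_prompt` helper and the `Question:` / `Answer:` scaffolding are assumptions made for illustration; they are not the templates of any specific evaluation library.

```python
# Hypothetical choice styles: same content, different way of introducing choices.
CHOICE_FORMATS = {
    "bare": "{letter} {choice}",
    "dot": "{letter}. {choice}",
    "paren": "{letter}) {choice}",
}


def render_prompt(question, choices, style):
    """Build a multiple-choice prompt using one of the choice styles."""
    template = CHOICE_FORMATS[style]
    lines = [f"Question: {question}"]
    for letter, choice in zip("ABCD", choices):
        lines.append(template.format(letter=letter, choice=choice))
    lines.append("Answer:")
    return "\n".join(lines)


question = "Which planet is closest to the Sun?"
choices = ["Venus", "Mercury", "Earth", "Mars"]

for style in CHOICE_FORMATS:
    print(f"--- {style} ---")
    print(render_prompt(question, choices, style))
```

The three prompts carry exactly the same information, so when trying to reproduce a score it is worth diffing the rendered prompt strings themselves rather than trusting that two setups use "the same" task.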
+ +*Evaluation on MMLU subsets, acc_norm score (seed 0), in 5-shot.* + + + +This [great paper](https://arxiv.org/abs/2407.07890)⭐ also highlights a side effect of this: a number of models are now trained to overfit benchmark prompts and answer formats, at the cost of adaptation to other prompts at evaluation time. + + +Some tasks are also prefixed with a task prompt (e.g. `The following questions are about `) - its presence or absence will also affect the scores. + +**System prompt and chat template** + +Chat models usually have been through instruction/preference training or fine-tuning. During this stage, they have learned to follow specific templates when inferring. For example, templates can require starting rounds of dialogue with a general prompt (called the `system prompt`) prefixed by specific tokens (usually `System: `). Said prompt is here to provide high-level instructions for the model, such as the contents of a persona, or general answering style instructions. Rounds of dialogue can also require adding prefix key words to text, such as `User` for queries and `Assistant` for answers. + +When using few shot, you also need to select if you want examples to be provided multi-turn (mimicking user/assistant turns) or all at once (in a single user prompt). + +Not following the chat template expected by the model at inference will kill its performance, as it will drive its output outside of the probability space it's been converging on. + +Similarly, if you are using a reasoning model, you need to make sure whether you are comparing with or without thinking enabled. + +**Few-shots samples** + +Three things are easy to mess up with few-shot samples: the number of few-shot examples, which ones you are using, and their specific ordering. + + +The importance of using the same examples is not too surprising, if we assume some samples are better at expressing the task than others. 
More surprising maybe: you not only need to use the exact same samples, but also present them in the **exact same order**. Varying the order on the same samples led us to observe up to 3 points of difference on some subsets of MMLU (you can see [some results here](https://huggingface.co/blog/evaluation-structured-outputs), it's the third colorgrid). + + +This is also a place where paying attention to the random seeds is important. + +**Parameters** + +For generative evaluations, parameters to pay attention to are making sure you are 1) using the **same end of sentence token** (you probably should not be using a default one for chat and reasoning models); 2) allowing your model to **generate the same number of tokens** for the evaluation (this is particularly crucial for reasoning models, which require a huge number of tokens in thinking mode); 3) if using sampling, that you are using the **same seed/temperature parameters**. + diff --git a/app/src/content/embeds/arxiv/arxiv.html b/app/src/content/embeds/arxiv/arxiv.html new file mode 100644 index 0000000000000000000000000000000000000000..344a717512bb3837168a74d2a90e2c249a8e406d --- /dev/null +++ b/app/src/content/embeds/arxiv/arxiv.html @@ -0,0 +1,566 @@ +
        + + + + \ No newline at end of file diff --git a/app/src/content/embeds/arxiv/fetch_arxiv_api.py b/app/src/content/embeds/arxiv/fetch_arxiv_api.py new file mode 100644 index 0000000000000000000000000000000000000000..18b3a9f8a651ae9e5313a02b2356e0c87db75d90 --- /dev/null +++ b/app/src/content/embeds/arxiv/fetch_arxiv_api.py @@ -0,0 +1,270 @@ +#!/usr/bin/env python3 +""" +Script to retrieve papers from the arXiv API +Optimized for natural representation of scientific domains +""" + +import requests +import xml.etree.ElementTree as ET +import json +import time +import os +from urllib.parse import quote +from datetime import datetime, timedelta +from collections import Counter +import random + +class ArxivFetcher: + def __init__(self): + self.base_url = "http://export.arxiv.org/api/query" + self.delay = 3 # Delay between requests (respecting API limits) + + def fetch_by_category(self, categories, max_per_category=500, total_max=15000): + """Retrieve papers by category with global limit""" + print(f"🔍 Retrieval by category (max {max_per_category} per cat, {total_max} total)") + + all_papers = [] + + for i, category in enumerate(categories): + if len(all_papers) >= total_max: + break + + print(f" [{i+1}/{len(categories)}] {category}...") + + # Dynamic calculation of number to retrieve + remaining = total_max - len(all_papers) + fetch_count = min(max_per_category, remaining) + + papers = self._fetch_category(category, fetch_count) + all_papers.extend(papers) + + print(f" ✅ {len(papers)} papers retrieved (total: {len(all_papers)})") + + # Delay between categories + if i < len(categories) - 1: + time.sleep(self.delay) + + return all_papers[:total_max] + + def fetch_recent_papers(self, days_back=30, max_results=15000): + """Retrieve recent papers from the last days""" + print(f"📅 Retrieving papers from the last {days_back} days") + + # End date: today + end_date = datetime.now() + # Start date: X days ago + start_date = end_date - timedelta(days=days_back) + + # 
arXiv format: YYYYMMDDHHMM + date_query = f"submittedDate:[{start_date.strftime('%Y%m%d%H%M')} TO {end_date.strftime('%Y%m%d%H%M')}]" + + return self._fetch_with_query(date_query, max_results) + + def _fetch_category(self, category, max_results): + """Retrieve papers from a specific category""" + query = f"cat:{category}" + return self._fetch_with_query(query, max_results) + + def _fetch_with_query(self, query, max_results): + """Generic method to retrieve with a query""" + papers = [] + start = 0 + batch_size = min(1000, max_results) # arXiv limits to 1000 per request + + while len(papers) < max_results: + remaining = max_results - len(papers) + current_batch = min(batch_size, remaining) + + params = { + 'search_query': query, + 'start': start, + 'max_results': current_batch, + 'sortBy': 'submittedDate', + 'sortOrder': 'descending' + } + + try: + response = requests.get(self.base_url, params=params, timeout=30) + response.raise_for_status() + + batch_papers = self._parse_response(response.text) + if not batch_papers: + print(f" ⚠️ No results for start={start}") + break + + papers.extend(batch_papers) + start += len(batch_papers) + + print(f" 📄 Batch {len(batch_papers)} papers (total: {len(papers)})") + + # Delay between requests + time.sleep(self.delay) + + except Exception as e: + print(f" ❌ Error: {e}") + break + + return papers[:max_results] + + def _parse_response(self, xml_content): + """Parse arXiv XML response""" + papers = [] + + try: + root = ET.fromstring(xml_content) + + # arXiv namespaces + ns = {'atom': 'http://www.w3.org/2005/Atom', + 'arxiv': 'http://arxiv.org/schemas/atom'} + + entries = root.findall('atom:entry', ns) + + for entry in entries: + try: + # arXiv ID + arxiv_id = entry.find('atom:id', ns).text.split('/')[-1] + + # Title + title = entry.find('atom:title', ns).text.strip() + title = ' '.join(title.split()) # Clean spaces + + # Abstract + summary = entry.find('atom:summary', ns).text.strip() + summary = ' '.join(summary.split())[:500] # 
Limit size + + # Authors + authors = [] + for author in entry.findall('atom:author', ns): + name = author.find('atom:name', ns) + if name is not None: + authors.append(name.text.strip()) + + # Categories + categories = [] + primary_category = None + + for category in entry.findall('atom:category', ns): + term = category.get('term') + if term: + categories.append(term) + + # Primary category + primary_cat = entry.find('arxiv:primary_category', ns) + if primary_cat is not None: + primary_category = primary_cat.get('term') + elif categories: + primary_category = categories[0] + + # Publication date + published = entry.find('atom:published', ns) + published_date = published.text if published is not None else None + + paper = { + 'id': arxiv_id, + 'title': title, + 'summary': summary, + 'authors': authors, + 'categories': categories, + 'primary_category': primary_category, + 'published': published_date + } + + papers.append(paper) + + except Exception as e: + print(f" ⚠️ Error parsing entry: {e}") + continue + + except ET.ParseError as e: + print(f"❌ XML parsing error: {e}") + + return papers + +def save_papers(papers, filename): + """Save papers to JSON""" + with open(filename, 'w', encoding='utf-8') as f: + json.dump(papers, f, indent=2, ensure_ascii=False) + + size_mb = os.path.getsize(filename) / 1024 / 1024 + print(f"💾 Saved: {filename} ({len(papers)} papers, {size_mb:.1f} MB)") + +def main(): + """Main arXiv data retrieval""" + print("🚀 ArXiv Data Fetcher - Optimized Version") + print("=" * 50) + + fetcher = ArxivFetcher() + + # Simple approach: 1 month of recent data + print("\n📅 SIMPLE APPROACH: 1 month of recent data") + print("🎯 Objective: retrieve everything available from the last month") + print("⚡ Without representativeness constraint - just natural data") + + # Try with different periods to find data + monthly_papers = None + for days in [30, 60, 90, 120]: # 1, 2, 3, 4 months + print(f"\n🔍 Attempt: {days} days...") + monthly_papers = 
fetcher.fetch_recent_papers(days_back=days, max_results=15000) + if monthly_papers and len(monthly_papers) > 1000: + print(f"✅ {len(monthly_papers)} papers found over {days} days") + break + elif monthly_papers: + print(f"⚠️ Only {len(monthly_papers)} papers over {days} days") + else: + print(f"❌ No papers found over {days} days") + + if not monthly_papers: + print("\n🔄 Fallback: retrieval by popular categories") + # If no recent data, just take popular categories + popular_categories = [ + 'cs.LG', 'cs.AI', 'cs.CV', 'cs.CL', 'cs.CR', 'cs.RO', 'cs.HC', + 'physics.comp-ph', 'physics.data-an', 'physics.optics', + 'math.ST', 'math.NA', 'math.OC', 'math.PR', + 'stat.ML', 'stat.ME', 'stat.AP', + 'eess.AS', 'eess.IV', 'eess.SP', + 'q-bio.QM', 'q-bio.BM', 'astro-ph.CO' + ] + + monthly_papers = fetcher.fetch_by_category( + categories=popular_categories, + max_per_category=500, + total_max=15000 + ) + + if monthly_papers: + save_papers(monthly_papers, "arxiv_monthly_papers.json") + + # Final statistics + from collections import Counter + + # Check paper structure + sample_keys = list(monthly_papers[0].keys()) if monthly_papers else [] + category_key = 'primary_category' if 'primary_category' in sample_keys else 'categories' + + domains = [] + for paper in monthly_papers: + if category_key in paper: + cat = paper[category_key] + if isinstance(cat, list) and cat: + domains.append(cat[0].split('.')[0]) + elif isinstance(cat, str): + domains.append(cat.split('.')[0]) + + domain_counts = Counter(domains) + + print(f"\n📊 Natural distribution ({len(monthly_papers)} papers):") + for domain, count in domain_counts.most_common(): + percentage = count / len(monthly_papers) * 100 + print(f" {domain}: {count} papers ({percentage:.1f}%)") + else: + print("❌ Complete retrieval failure") + + print("\n🎉 Retrieval completed!") + print("📁 Files created:") + for filename in ["arxiv_monthly_papers.json"]: + if os.path.exists(filename): + size = os.path.getsize(filename) / 1024 / 1024 # MB + 
print(f" - {filename} ({size:.1f} MB)") + +if __name__ == "__main__": + main() \ No newline at end of file diff --git a/app/src/content/embeds/arxiv/generate_umap.py b/app/src/content/embeds/arxiv/generate_umap.py new file mode 100644 index 0000000000000000000000000000000000000000..23e43a76394a43d62abaac029f8d59776b59fd67 --- /dev/null +++ b/app/src/content/embeds/arxiv/generate_umap.py @@ -0,0 +1,329 @@ +#!/usr/bin/env python3 +""" +UMAP Generator for arXiv papers +Creates 2D and 3D projections with density-weighted centroids +""" + +import json +import numpy as np +import pandas as pd +from sklearn.feature_extraction.text import TfidfVectorizer +from sklearn.decomposition import TruncatedSVD +import umap +import os +import shutil +from datetime import datetime +from collections import Counter + +def load_papers(filename="arxiv_monthly_papers.json"): + """Load papers from JSON file""" + if not os.path.exists(filename): + print(f"❌ File {filename} not found!") + print("💡 Run fetch_arxiv_api.py first") + return None + + with open(filename, 'r', encoding='utf-8') as f: + papers = json.load(f) + + print(f"📚 {len(papers)} papers loaded from {filename}") + return papers + +def preprocess_papers(papers, sample_rate=5): + """Preprocess papers and sample if necessary""" + print(f"🔄 Preprocessing papers...") + + # Filter papers with missing data + valid_papers = [] + for paper in papers: + if (paper.get('title') and + paper.get('summary') and + paper.get('primary_category')): + valid_papers.append(paper) + + print(f"✅ {len(valid_papers)} valid papers after filtering") + + # Sampling for performance (1 out of N) + if sample_rate > 1: + sampled_papers = valid_papers[::sample_rate] + print(f"📊 Sampling 1/{sample_rate}: {len(sampled_papers)} papers retained") + return sampled_papers + + return valid_papers + +def create_embeddings(papers, max_features=5000, n_components=50): + """Create TF-IDF + SVD embeddings of papers""" + print(f"🔢 Creating embeddings 
(max_features={max_features}, n_components={n_components})") + + # Combine title and summary + texts = [] + for paper in papers: + title = paper.get('title', '').strip() + summary = paper.get('summary', '').strip() + combined = f"{title} {summary}" + texts.append(combined) + + # TF-IDF + print(" 📝 TF-IDF vectorization...") + tfidf = TfidfVectorizer( + max_features=max_features, + stop_words='english', + ngram_range=(1, 2), + min_df=2, + max_df=0.95 + ) + + tfidf_matrix = tfidf.fit_transform(texts) + print(f" ✅ TF-IDF: {tfidf_matrix.shape}") + + # Dimensionality reduction with SVD + print(f" 🔄 SVD reduction to {n_components} dimensions...") + svd = TruncatedSVD(n_components=n_components, random_state=42) + embeddings = svd.fit_transform(tfidf_matrix) + + print(f" ✅ Final embeddings: {embeddings.shape}") + print(f" 📊 Explained variance: {svd.explained_variance_ratio_.sum():.3f}") + + return embeddings + +def map_to_families(papers): + """Map categories to 9 main scientific families""" + + # Mapping to 9 scientific families + domain_to_family = { + 'cs': 'Computer Science', + 'math': 'Mathematics', + 'physics': 'Physics', + 'stat': 'Statistics', + 'q-bio': 'Biology', + 'eess': 'Engineering', + 'astro-ph': 'Astrophysics', + 'cond-mat': 'Condensed Matter', + 'nucl': 'Nuclear Physics' + } + + families = [] + for paper in papers: + primary_cat = paper.get('primary_category', '') + if primary_cat: + domain = primary_cat.split('.')[0] + family = domain_to_family.get(domain, 'Other') + else: + family = 'Other' + families.append(family) + + family_counts = Counter(families) + print(f"📊 Distribution by family:") + for family, count in family_counts.most_common(): + print(f" {family}: {count} papers") + + return families + +def generate_umap_projection(embeddings, families, n_neighbors=50, min_dist=0.1, spread=0.5, n_components=2): + """Generate UMAP projection""" + print(f"🎯 UMAP projection (n_neighbors={n_neighbors}, min_dist={min_dist}, spread={spread}, 
n_components={n_components})") + + # UMAP configuration + reducer = umap.UMAP( + n_neighbors=n_neighbors, + min_dist=min_dist, + spread=spread, + n_components=n_components, + random_state=42, + metric='cosine' + ) + + # Projection + projection = reducer.fit_transform(embeddings) + print(f"✅ UMAP projection: {projection.shape}") + + return projection + +def calculate_density_weighted_centroids(projection, families, families_list): + """Calculate density-weighted centroids""" + print("🎯 Calculating density-weighted centroids...") + + centroids = {} + + for family in families_list: + # Points of this family + family_mask = np.array(families) == family + family_points = projection[family_mask] + + if len(family_points) < 30: # Filter families too small + continue + + if projection.shape[1] == 2: # 2D + # Calculate 2D density + densities = [] + for point in family_points: + distances = np.linalg.norm(family_points - point, axis=1) + density = np.sum(distances < np.percentile(distances, 20)) # Local density + densities.append(density) + + densities = np.array(densities) + weights = densities / densities.sum() + + # Weighted centroid + centroid_x = np.sum(family_points[:, 0] * weights) + centroid_y = np.sum(family_points[:, 1] * weights) + + centroids[family] = { + 'x': float(centroid_x), + 'y': float(centroid_y), + 'count': len(family_points) + } + + else: # 3D + # Calculate 3D density + densities = [] + for point in family_points: + distances = np.linalg.norm(family_points - point, axis=1) + density = np.sum(distances < np.percentile(distances, 20)) + densities.append(density) + + densities = np.array(densities) + weights = densities / densities.sum() + + # Weighted centroid + centroid_x = np.sum(family_points[:, 0] * weights) + centroid_y = np.sum(family_points[:, 1] * weights) + centroid_z = np.sum(family_points[:, 2] * weights) + + centroids[family] = { + 'x': float(centroid_x), + 'y': float(centroid_y), + 'z': float(centroid_z), + 'count': len(family_points) + } + + 
print(f"✅ {len(centroids)} centroids calculated") + return centroids + +def save_visualization_data(papers, projection, families, centroids, output_prefix): + """Save visualization data""" + + # Prepare data + viz_data = [] + for i, paper in enumerate(papers): + if projection.shape[1] == 2: # 2D + point = { + 'id': paper.get('id', f'paper_{i}'), + 'title': paper.get('title', ''), + 'summary': paper.get('summary', '')[:200] + '...', + 'authors': ', '.join(paper.get('authors', [])[:3]), # Max 3 authors + 'category': paper.get('primary_category', ''), + 'family': families[i], + 'x': float(projection[i, 0]), + 'y': float(projection[i, 1]) + } + else: # 3D + point = { + 'id': paper.get('id', f'paper_{i}'), + 'title': paper.get('title', ''), + 'summary': paper.get('summary', '')[:200] + '...', + 'authors': ', '.join(paper.get('authors', [])[:3]), + 'category': paper.get('primary_category', ''), + 'family': families[i], + 'x': float(projection[i, 0]), + 'y': float(projection[i, 1]), + 'z': float(projection[i, 2]) + } + viz_data.append(point) + + # Add centroids + viz_data_with_centroids = { + 'points': viz_data, + 'centroids': centroids, + 'metadata': { + 'total_papers': len(papers), + 'dimensions': projection.shape[1], + 'families': list(set(families)), + 'generated': datetime.now().isoformat() + } + } + + # Save + output_file = f"{output_prefix}.json" + with open(output_file, 'w', encoding='utf-8') as f: + json.dump(viz_data_with_centroids, f, indent=2, ensure_ascii=False) + + size_mb = os.path.getsize(output_file) / 1024 / 1024 + print(f"💾 Data saved: {output_file} ({size_mb:.1f} MB)") + + return output_file + +def main(): + """Main UMAP generation pipeline""" + print("🚀 ArXiv UMAP Generator") + print("=" * 40) + + # 1. Data loading + papers = load_papers() + if not papers: + return + + # 2. Preprocessing + papers = preprocess_papers(papers, sample_rate=5) # 1 point out of 5 + + # 3. 
Mapping to families + families = map_to_families(papers) + families_list = list(set(families)) + + # 4. Embedding creation + embeddings = create_embeddings(papers, max_features=3000, n_components=50) + + # 5. UMAP projection generation + + # UMAP 2D + print("\n🎯 Generating 2D UMAP...") + projection_2d = generate_umap_projection( + embeddings, families, + n_neighbors=50, min_dist=0.8, spread=1.0, n_components=2 + ) + + centroids_2d = calculate_density_weighted_centroids(projection_2d, families, families_list) + + timestamp = datetime.now().strftime("%Y%m%d_%H%M%S") + output_2d = save_visualization_data( + papers, projection_2d, families, centroids_2d, + f"arxiv_umap_viz_2d_{timestamp}" + ) + + # UMAP 3D + print("\n🎯 Generating 3D UMAP...") + projection_3d = generate_umap_projection( + embeddings, families, + n_neighbors=50, min_dist=0.8, spread=1.0, n_components=3 + ) + + centroids_3d = calculate_density_weighted_centroids(projection_3d, families, families_list) + + output_3d = save_visualization_data( + papers, projection_3d, families, centroids_3d, + f"arxiv_umap_viz_3d_{timestamp}" + ) + + # Automatic copy to content/assets/data + import shutil + source_file = output_2d # Use 2D by default + target_dir = "../../assets/data" + target_file = os.path.join(target_dir, "data.json") + + try: + # Create directory if necessary + os.makedirs(target_dir, exist_ok=True) + shutil.copy2(source_file, target_file) + print(f"\n✅ AUTOMATIC COPY SUCCESSFUL!") + print(f"📁 {source_file} → {target_file}") + except Exception as e: + print(f"\n⚠️ Automatic copy failed: {e}") + + print(f"\n🎉 Generation completed!") + print(f"📁 Files created:") + for f in [output_2d, output_3d]: + if os.path.exists(f): + size = os.path.getsize(f) / 1024 / 1024 + print(f" - {f} ({size:.1f} MB)") + +if __name__ == "__main__": + main() diff --git a/app/src/content/embeds/banner-molecules.html b/app/src/content/embeds/banner-molecules.html new file mode 100644 index 
0000000000000000000000000000000000000000..1458d5f585aef9e856de988375c6c34abe0ff5ff --- /dev/null +++ b/app/src/content/embeds/banner-molecules.html @@ -0,0 +1,522 @@ +
        + \ No newline at end of file diff --git a/app/src/content/embeds/banner-neural-network-animejs.html b/app/src/content/embeds/banner-neural-network-animejs.html new file mode 100644 index 0000000000000000000000000000000000000000..93e4acd30dcc4dbca6e798c6b8d1f2390da73064 --- /dev/null +++ b/app/src/content/embeds/banner-neural-network-animejs.html @@ -0,0 +1,464 @@ +
        + \ No newline at end of file diff --git a/app/src/content/embeds/banner-threejs-galaxy.html b/app/src/content/embeds/banner-threejs-galaxy.html new file mode 100644 index 0000000000000000000000000000000000000000..3ca4fde3d6ca705c69076a5f43dcc0b9d84a46c8 --- /dev/null +++ b/app/src/content/embeds/banner-threejs-galaxy.html @@ -0,0 +1,504 @@ +
        + + + \ No newline at end of file diff --git a/app/src/content/embeds/banner-umap-lucioles.html b/app/src/content/embeds/banner-umap-lucioles.html new file mode 100644 index 0000000000000000000000000000000000000000..1e3774ba8699703ae5acfe6f4cf8eb05b8ffa0e6 --- /dev/null +++ b/app/src/content/embeds/banner-umap-lucioles.html @@ -0,0 +1,489 @@ +
        + + \ No newline at end of file diff --git a/app/src/content/embeds/banner.html b/app/src/content/embeds/banner.html new file mode 100644 index 0000000000000000000000000000000000000000..ceef7fc3c54f2be7e635e7a14b21ff082b4897a3 --- /dev/null +++ b/app/src/content/embeds/banner.html @@ -0,0 +1,1235 @@ +
        +

        The benchmark lifecycle

        +
        +
        + + diff --git a/app/src/content/embeds/d3-ablation-workflow.html b/app/src/content/embeds/d3-ablation-workflow.html new file mode 100644 index 0000000000000000000000000000000000000000..97032fdc78a1af19afb713a1e3a486cd1126efd7 --- /dev/null +++ b/app/src/content/embeds/d3-ablation-workflow.html @@ -0,0 +1,474 @@ +
        + + + + diff --git a/app/src/content/embeds/d3-bar.html b/app/src/content/embeds/d3-bar.html new file mode 100644 index 0000000000000000000000000000000000000000..b5d311bc2a6aa58cf80996ffd27f6e1e694d3243 --- /dev/null +++ b/app/src/content/embeds/d3-bar.html @@ -0,0 +1,459 @@ +
        + + \ No newline at end of file diff --git a/app/src/content/embeds/d3-benchmark.html b/app/src/content/embeds/d3-benchmark.html new file mode 100644 index 0000000000000000000000000000000000000000..99b995dac556902cb083b35fb818571d3e19c374 --- /dev/null +++ b/app/src/content/embeds/d3-benchmark.html @@ -0,0 +1,434 @@ +
        + + + + diff --git a/app/src/content/embeds/d3-binary-metrics.html b/app/src/content/embeds/d3-binary-metrics.html new file mode 100644 index 0000000000000000000000000000000000000000..689fc4d3d361799d7770f76813ca6c0d2d9a2a96 --- /dev/null +++ b/app/src/content/embeds/d3-binary-metrics.html @@ -0,0 +1,400 @@ +
        + + + + diff --git a/app/src/content/embeds/d3-confusion-matrix.html b/app/src/content/embeds/d3-confusion-matrix.html new file mode 100644 index 0000000000000000000000000000000000000000..944807d8cadc0e37db2bd6d04b48fb9f2bcb1b6e --- /dev/null +++ b/app/src/content/embeds/d3-confusion-matrix.html @@ -0,0 +1,516 @@ +
        + + + + diff --git a/app/src/content/embeds/d3-decision-tree.html b/app/src/content/embeds/d3-decision-tree.html new file mode 100644 index 0000000000000000000000000000000000000000..a2169adc8d15e5b34e1feffa635b252a2ad6ccc7 --- /dev/null +++ b/app/src/content/embeds/d3-decision-tree.html @@ -0,0 +1,363 @@ +
        + + diff --git a/app/src/content/embeds/d3-equation-editor.html b/app/src/content/embeds/d3-equation-editor.html new file mode 100644 index 0000000000000000000000000000000000000000..97cbd03d8e8a7988133b33f6da5c849228d4ff34 --- /dev/null +++ b/app/src/content/embeds/d3-equation-editor.html @@ -0,0 +1,677 @@ +
        + + \ No newline at end of file diff --git a/app/src/content/embeds/d3-evaluation-decision-tree.html b/app/src/content/embeds/d3-evaluation-decision-tree.html new file mode 100644 index 0000000000000000000000000000000000000000..9d524246cf28cbacd17aee0488fb672ffbe247a7 --- /dev/null +++ b/app/src/content/embeds/d3-evaluation-decision-tree.html @@ -0,0 +1,333 @@ +
        + + diff --git a/app/src/content/embeds/d3-human-biases.html b/app/src/content/embeds/d3-human-biases.html new file mode 100644 index 0000000000000000000000000000000000000000..c7b5869fc23fefff20bc287f32829c24ce2b8bd9 --- /dev/null +++ b/app/src/content/embeds/d3-human-biases.html @@ -0,0 +1,352 @@ +
        + + + + diff --git a/app/src/content/embeds/d3-intro-boxes.html b/app/src/content/embeds/d3-intro-boxes.html new file mode 100644 index 0000000000000000000000000000000000000000..2abb4812cbc08a22e6ec0e66f7c788741fd33622 --- /dev/null +++ b/app/src/content/embeds/d3-intro-boxes.html @@ -0,0 +1,161 @@ +
        + + + + diff --git a/app/src/content/embeds/d3-line-quad.html b/app/src/content/embeds/d3-line-quad.html new file mode 100644 index 0000000000000000000000000000000000000000..b7e275ca7e6f83b15f7d6b7c6dd83a3bdcd2bd2e --- /dev/null +++ b/app/src/content/embeds/d3-line-quad.html @@ -0,0 +1,783 @@ +
        + +
        +
        +
        +
        +
        +
        +
        + + +
        + + \ No newline at end of file diff --git a/app/src/content/embeds/d3-llm-biases.html b/app/src/content/embeds/d3-llm-biases.html new file mode 100644 index 0000000000000000000000000000000000000000..52a1964ceaa08cad8fc7dbdbbb1df1bbf7078b0e --- /dev/null +++ b/app/src/content/embeds/d3-llm-biases.html @@ -0,0 +1,378 @@ +
        + + + + diff --git a/app/src/content/embeds/d3-matrix.html b/app/src/content/embeds/d3-matrix.html new file mode 100644 index 0000000000000000000000000000000000000000..0d76ac4dbde93deb809cec26e1d65ebccc460227 --- /dev/null +++ b/app/src/content/embeds/d3-matrix.html @@ -0,0 +1,524 @@ +
        + + \ No newline at end of file diff --git a/app/src/content/embeds/d3-metrics-comparison.html b/app/src/content/embeds/d3-metrics-comparison.html new file mode 100644 index 0000000000000000000000000000000000000000..4b56228065f57106e2c2cece162f98adf4a4d030 --- /dev/null +++ b/app/src/content/embeds/d3-metrics-comparison.html @@ -0,0 +1,572 @@ +
        + + + + diff --git a/app/src/content/embeds/d3-mmlu-heatmap.html b/app/src/content/embeds/d3-mmlu-heatmap.html new file mode 100644 index 0000000000000000000000000000000000000000..d97653d36f184936742b2debb9561ab755957817 --- /dev/null +++ b/app/src/content/embeds/d3-mmlu-heatmap.html @@ -0,0 +1,516 @@ +
        +
        +
        +
        + + diff --git a/app/src/content/embeds/d3-neural-network.html b/app/src/content/embeds/d3-neural-network.html new file mode 100644 index 0000000000000000000000000000000000000000..b721173977c47d028a3d5e7f5ac6150348742538 --- /dev/null +++ b/app/src/content/embeds/d3-neural-network.html @@ -0,0 +1,951 @@ +
        + + \ No newline at end of file diff --git a/app/src/content/embeds/d3-pie-quad.html b/app/src/content/embeds/d3-pie-quad.html new file mode 100644 index 0000000000000000000000000000000000000000..314d16e9870f8a4a4845e2e0257221d5fa643175 --- /dev/null +++ b/app/src/content/embeds/d3-pie-quad.html @@ -0,0 +1,346 @@ +
        + + + + + + + + diff --git a/app/src/content/embeds/d3-pie.html b/app/src/content/embeds/d3-pie.html new file mode 100644 index 0000000000000000000000000000000000000000..594e016ef229dfae93655bcda2a059dc9cd49f35 --- /dev/null +++ b/app/src/content/embeds/d3-pie.html @@ -0,0 +1,262 @@ +
        + + \ No newline at end of file diff --git a/app/src/content/embeds/d3-precision-recall.html b/app/src/content/embeds/d3-precision-recall.html new file mode 100644 index 0000000000000000000000000000000000000000..037a59a9b563e22ead4242807e296f42c1a59830 --- /dev/null +++ b/app/src/content/embeds/d3-precision-recall.html @@ -0,0 +1,348 @@ +
        + + + + diff --git a/app/src/content/embeds/d3-quality-management.html b/app/src/content/embeds/d3-quality-management.html new file mode 100644 index 0000000000000000000000000000000000000000..8f1b819f278973ee51f19eb28db98078dfb9c020 --- /dev/null +++ b/app/src/content/embeds/d3-quality-management.html @@ -0,0 +1,254 @@ +
        + + + + diff --git a/app/src/content/embeds/d3-sampling-metrics.html b/app/src/content/embeds/d3-sampling-metrics.html new file mode 100644 index 0000000000000000000000000000000000000000..09a4aed612c2d0d68fbccbf8f12e59fbdb286bc5 --- /dev/null +++ b/app/src/content/embeds/d3-sampling-metrics.html @@ -0,0 +1,513 @@ +
        + + + + diff --git a/app/src/content/embeds/d3-scatter.html b/app/src/content/embeds/d3-scatter.html new file mode 100644 index 0000000000000000000000000000000000000000..610baf1e52e9e65ab5e15e103f8bfe97128c97c9 --- /dev/null +++ b/app/src/content/embeds/d3-scatter.html @@ -0,0 +1,300 @@ +
        + + + + diff --git a/app/src/content/embeds/d3-text-metrics.html b/app/src/content/embeds/d3-text-metrics.html new file mode 100644 index 0000000000000000000000000000000000000000..a9b6f8a20c87851f384d73957a05952ae9c83bd7 --- /dev/null +++ b/app/src/content/embeds/d3-text-metrics.html @@ -0,0 +1,497 @@ +
        + + + + diff --git a/app/src/content/embeds/d3-tokenization-timeline.html b/app/src/content/embeds/d3-tokenization-timeline.html new file mode 100644 index 0000000000000000000000000000000000000000..22ab8ccc8fbc53509bb1d7078fad3cff9b8f41f5 --- /dev/null +++ b/app/src/content/embeds/d3-tokenization-timeline.html @@ -0,0 +1,266 @@ +
        + + + + + + + + + + + Raw Text + Early Models (before 2022) + + +
        +
        Translate to French:
        +
        Hello world
        +
        +
        +
        + + + + Evolution + + + + + Chat Templates (in JSON) + Chat Models (2022-2025) + + +
        +
        {
        +
        "role": "system",
        +
        "content": "You are..."
        +
        },
        +
        {
        +
        "role": "user",
        +
        "content": "Hello"
        +
        }
        +
        +
        +
        + + + + Evolution + + + + + JSON + XML + Reasoning Models (2025+) + + +
        +
        {
        +
        "role": "assistant",
        +
        "content": [
        +
        <thinking>
        +
        reasoning...
        +
        </thinking>
        +
        <output>
        +
        response
        +
        </output>
        +
        ]
        +
        }
        +
        +
        +
        + + + + • Simple prompts + • Generally no structure + • Completion-based + + + + • Role separation + • Chat/Turn-based + + + + • Chat/Turn-based + with added tags for control + + + + + + + + + Before 2022 + 2022-2025 + 2025+ +
        +
        + + \ No newline at end of file diff --git a/app/src/content/embeds/d3-tokenization.html b/app/src/content/embeds/d3-tokenization.html new file mode 100644 index 0000000000000000000000000000000000000000..ad25bc178d1da021406bf8d395a1930ee50009c9 --- /dev/null +++ b/app/src/content/embeds/d3-tokenization.html @@ -0,0 +1,168 @@ +
        + + + + + + + + + + Input Text + "Hello, world!" + + + + + + + Tokenizer + Split into tokens + + + + + + + Tokens + + + + Hello + + + , + + + world + + + ! + + + [5425] + [11] + [1917] + [0] + + + + + + + Language Model + + + + + + + + + + Process & Generate + + + + + + + Output / Prediction + +
        + + \ No newline at end of file diff --git a/app/src/content/embeds/d3-two-lines-chart.html b/app/src/content/embeds/d3-two-lines-chart.html new file mode 100644 index 0000000000000000000000000000000000000000..ad33a2ddc244947ab936a6213f6a93ddc938a190 --- /dev/null +++ b/app/src/content/embeds/d3-two-lines-chart.html @@ -0,0 +1,1142 @@ + +
        + + + diff --git a/app/src/content/embeds/d3-umap-typography.html b/app/src/content/embeds/d3-umap-typography.html new file mode 100644 index 0000000000000000000000000000000000000000..47fbd89dfc09dee38527a9254cf134ec5438315f --- /dev/null +++ b/app/src/content/embeds/d3-umap-typography.html @@ -0,0 +1,804 @@ +
        + + + + + +
        +
        + + + \ No newline at end of file diff --git a/app/src/content/embeds/d3-vibe-checks.html b/app/src/content/embeds/d3-vibe-checks.html new file mode 100644 index 0000000000000000000000000000000000000000..57036d4b0f207f305f61335abc5c133d4ecab90d --- /dev/null +++ b/app/src/content/embeds/d3-vibe-checks.html @@ -0,0 +1,338 @@ +
        + + + + diff --git a/app/src/content/embeds/demo/content-structure.html b/app/src/content/embeds/demo/content-structure.html new file mode 100644 index 0000000000000000000000000000000000000000..c388ebb897e2528321df178dd9bb888fbe2ac055 --- /dev/null +++ b/app/src/content/embeds/demo/content-structure.html @@ -0,0 +1,161 @@ + + + + diff --git a/app/src/content/embeds/finetasks-monotonicity.html b/app/src/content/embeds/finetasks-monotonicity.html new file mode 100644 index 0000000000000000000000000000000000000000..9400fc1da81111eb5c63d8fd6dc05efb8134772b --- /dev/null +++ b/app/src/content/embeds/finetasks-monotonicity.html @@ -0,0 +1,521 @@ +
        +
        +
        +
        + + + + diff --git a/app/src/content/embeds/finetasks-ordering.html b/app/src/content/embeds/finetasks-ordering.html new file mode 100644 index 0000000000000000000000000000000000000000..8937e53a353b6819789431c89d1545b0451b0214 --- /dev/null +++ b/app/src/content/embeds/finetasks-ordering.html @@ -0,0 +1,521 @@ +
        +
        +
        +
        + + + + diff --git a/app/src/content/embeds/finetasks-randomness.html b/app/src/content/embeds/finetasks-randomness.html new file mode 100644 index 0000000000000000000000000000000000000000..94ef8e85ff9ef0d9ee84a4af80e61c2d2e12ded7 --- /dev/null +++ b/app/src/content/embeds/finetasks-randomness.html @@ -0,0 +1,521 @@ +
        +
        +
        +
        + + + + diff --git a/app/src/content/embeds/finetasks-snr.html b/app/src/content/embeds/finetasks-snr.html new file mode 100644 index 0000000000000000000000000000000000000000..cb95760b73a6df2e6c905f63317af31170e4c594 --- /dev/null +++ b/app/src/content/embeds/finetasks-snr.html @@ -0,0 +1,521 @@ +
        +
        +
        +
        + + + + diff --git a/app/src/content/embeds/rope-demo.html b/app/src/content/embeds/rope-demo.html new file mode 100644 index 0000000000000000000000000000000000000000..a5741b82375de2a369e72912eafa52557d528dfc --- /dev/null +++ b/app/src/content/embeds/rope-demo.html @@ -0,0 +1,532 @@ +
        + + + + diff --git a/app/src/content/embeds/smol-playbook/attention-mechanisms.html b/app/src/content/embeds/smol-playbook/attention-mechanisms.html new file mode 100644 index 0000000000000000000000000000000000000000..23fcfba0cccdd21abdc8815a527e6c99909420be --- /dev/null +++ b/app/src/content/embeds/smol-playbook/attention-mechanisms.html @@ -0,0 +1,711 @@ +
        + + + + + + + \ No newline at end of file diff --git a/app/src/content/embeds/smol-playbook/aws-bandwidth-bottleneck.html b/app/src/content/embeds/smol-playbook/aws-bandwidth-bottleneck.html new file mode 100644 index 0000000000000000000000000000000000000000..e8a92a9cfc7be75d6dca5905ca414c490a31642f --- /dev/null +++ b/app/src/content/embeds/smol-playbook/aws-bandwidth-bottleneck.html @@ -0,0 +1,2659 @@ + +
        +
        +
        +
        +
        +
        +
        Bandwidth Max
        +
        for CPU → GPU
        +
        -
        +
        GB/s
        + + +
        +
        + + + + + \ No newline at end of file diff --git a/app/src/content/embeds/smol-playbook/gds-interactive-heatmaps.html b/app/src/content/embeds/smol-playbook/gds-interactive-heatmaps.html new file mode 100644 index 0000000000000000000000000000000000000000..9eb3a901924ab7ceabc33186901f169183c88fb2 --- /dev/null +++ b/app/src/content/embeds/smol-playbook/gds-interactive-heatmaps.html @@ -0,0 +1,560 @@ +
        + + \ No newline at end of file diff --git a/app/src/content/embeds/smol-playbook/generic-d3-line-chart.html b/app/src/content/embeds/smol-playbook/generic-d3-line-chart.html new file mode 100644 index 0000000000000000000000000000000000000000..a77794a15d16a3ec7f8da944a134fd0d01f71fa2 --- /dev/null +++ b/app/src/content/embeds/smol-playbook/generic-d3-line-chart.html @@ -0,0 +1,1184 @@ + +
        + + \ No newline at end of file diff --git a/app/src/content/embeds/smol-playbook/generic-d3-six-line-charts.html b/app/src/content/embeds/smol-playbook/generic-d3-six-line-charts.html new file mode 100644 index 0000000000000000000000000000000000000000..1d6340dd1546474aa78f6ab649922c42c1c7bdef --- /dev/null +++ b/app/src/content/embeds/smol-playbook/generic-d3-six-line-charts.html @@ -0,0 +1,874 @@ + +
        + + \ No newline at end of file diff --git a/app/src/content/embeds/smol-playbook/model-architecture-decision-flowchart.html b/app/src/content/embeds/smol-playbook/model-architecture-decision-flowchart.html new file mode 100644 index 0000000000000000000000000000000000000000..3b053b3ac491f79b3e1e3b5d79147ed954007401 --- /dev/null +++ b/app/src/content/embeds/smol-playbook/model-architecture-decision-flowchart.html @@ -0,0 +1,490 @@ + +
        + + \ No newline at end of file diff --git a/app/src/content/embeds/smol-playbook/parameter-calculator.html b/app/src/content/embeds/smol-playbook/parameter-calculator.html new file mode 100644 index 0000000000000000000000000000000000000000..d1b63566798529ea345b4dafd20f9ae913564450 --- /dev/null +++ b/app/src/content/embeds/smol-playbook/parameter-calculator.html @@ -0,0 +1,616 @@ +
        + + + + \ No newline at end of file diff --git a/app/src/content/embeds/smol-playbook/parameter-comparison.html b/app/src/content/embeds/smol-playbook/parameter-comparison.html new file mode 100644 index 0000000000000000000000000000000000000000..ac012fe9f9baa3b5b8d453f1d037fc8e605d86fd --- /dev/null +++ b/app/src/content/embeds/smol-playbook/parameter-comparison.html @@ -0,0 +1,807 @@ +
        + + + + \ No newline at end of file diff --git a/app/src/content/embeds/smol-playbook/post-training-adventure.html b/app/src/content/embeds/smol-playbook/post-training-adventure.html new file mode 100644 index 0000000000000000000000000000000000000000..e49c890c6eb17e7c5aa3f5b2a2d96bd0bc739a81 --- /dev/null +++ b/app/src/content/embeds/smol-playbook/post-training-adventure.html @@ -0,0 +1,381 @@ + + + + + + + Post-Training Adventure + + + + +
        +
        +
        + + + + + + + + \ No newline at end of file diff --git a/app/src/content/embeds/smol-playbook/train-model-decision-flowchart.html b/app/src/content/embeds/smol-playbook/train-model-decision-flowchart.html new file mode 100644 index 0000000000000000000000000000000000000000..ecf66a59d56d8f871f7e4865ea963e0972514977 --- /dev/null +++ b/app/src/content/embeds/smol-playbook/train-model-decision-flowchart.html @@ -0,0 +1,522 @@ + +
        + + \ No newline at end of file diff --git a/app/src/content/embeds/typography/1-download-fonts.mjs b/app/src/content/embeds/typography/1-download-fonts.mjs new file mode 100755 index 0000000000000000000000000000000000000000..b0d2378db4c87ec2fb39b0830af512f01a61a28d --- /dev/null +++ b/app/src/content/embeds/typography/1-download-fonts.mjs @@ -0,0 +1,531 @@ +#!/usr/bin/env node + +import fs from 'fs/promises'; +import path from 'path'; +import { fileURLToPath } from 'url'; +import opentype from 'opentype.js'; +import fonteditor from 'fonteditor-core'; + +const __filename = fileURLToPath(import.meta.url); +const __dirname = path.dirname(__filename); + +// Configuration +const GOOGLE_FONTS_API_KEY = process.env.GOOGLE_FONTS_API_KEY; +const GOOGLE_FONTS_API_URL = 'https://www.googleapis.com/webfonts/v1/webfonts'; +const TYPOGRAPHY_BASE = __dirname; +const GENERATED_DIR = path.join(TYPOGRAPHY_BASE, 'generated'); +const FONTS_DIR = path.join(GENERATED_DIR, 'fonts'); +const SVGS_DIR = path.join(GENERATED_DIR, 'svgs'); +const FONT_MANIFEST_PATH = path.join(GENERATED_DIR, 'data', 'font_manifest.json'); +const TYPOGRAPHY_DATA_PATH = path.join(GENERATED_DIR, 'data', 'typography_data.json'); + +/** + * Downloads the Google Fonts list + */ +async function fetchGoogleFontsList() { + // 1. 
Try Google Fonts API with key if available + if (GOOGLE_FONTS_API_KEY && GOOGLE_FONTS_API_KEY !== 'YOUR_API_KEY_HERE') { + try { + console.log('🔍 Fetching from Google Fonts API (with key)...'); + const url = `${GOOGLE_FONTS_API_URL}?key=${GOOGLE_FONTS_API_KEY}&sort=popularity`; + + const response = await fetch(url); + if (!response.ok) { + throw new Error(`HTTP ${response.status}`); + } + + const data = await response.json(); + const fonts = data.items || []; + + console.log(`✅ ${fonts.length} fonts retrieved from official API`); + + return fonts.map(font => ({ + family: font.family, + category: font.category || 'sans-serif', + files: font.files || {} + })); + + } catch (error) { + console.error('❌ Google Fonts API error:', error.message); + } + } + + // 2. Try fontsource google-font-metadata (without key) + try { + console.log('🔍 Attempting via fontsource google-font-metadata...'); + + const metadataUrl = 'https://raw.githubusercontent.com/fontsource/google-font-metadata/main/data/google-fonts-v1.json'; + + console.log(`📥 Attempting: ${metadataUrl}`); + const response = await fetch(metadataUrl, { signal: AbortSignal.timeout(15000) }); // fetch() ignores a `timeout` option; abort via signal instead + + if (response.ok) { + const data = await response.json(); + + // Fontsource data structure: { "font-id": { family: "Font Name", category: "sans-serif", ... }, ... } + if (data && typeof data === 'object') { + const fonts = Object.values(data).map(font => ({ + family: font.family, + category: font.category || 'sans-serif', + files: { + regular: `https://fonts.googleapis.com/css2?family=${encodeURIComponent(font.family)}:wght@400&display=swap` + } + })); + + console.log(`✅ ${fonts.length} fonts retrieved from fontsource metadata`); + return fonts; + } + } else { + console.log(`⚠️ Failed ${metadataUrl}: HTTP ${response.status}`); + } + + } catch (error) { + console.log('⚠️ Fontsource metadata not available:', error.message); + } + + // 3. 
Fallback: existing local manifest + + // Reuse the font names recorded in the manifest by a previous run + try { + const manifestData = await fs.readFile(FONT_MANIFEST_PATH, 'utf-8'); + const manifest = JSON.parse(manifestData); + const fontNames = Object.keys(manifest); + + console.log(`📚 ${fontNames.length} fonts found in manifest`); + + return fontNames.map(name => ({ + family: name, + files: { + regular: `https://fonts.googleapis.com/css2?family=${encodeURIComponent(name)}:wght@400&display=swap` + } + })); + } catch (error) { + console.error('❌ Error reading manifest:', error.message); + + // Last resort: hard-coded list of popular fonts + console.log('🔄 Using fallback fonts...'); + return getFallbackFonts(); + } +} + +// Hard-coded fallback list, used when neither a remote source nor a manifest is available +function getFallbackFonts() { + const fallbackFonts = [ + "Roboto", "Open Sans", "Lato", "Montserrat", "Source Sans Pro", + "Playfair Display", "Lora", "Crimson Text", "Merriweather", + "Fira Code", "Source Code Pro", "JetBrains Mono", "Roboto Mono", + "Dancing Script", "Pacifico", "Caveat", "Oswald", "Bebas Neue" + ]; + + return fallbackFonts.map(name => ({ + family: name, + files: { + regular: `https://fonts.googleapis.com/css2?family=${encodeURIComponent(name)}:wght@400&display=swap` + } + })); +} + +/** + * Extracts a font file URL (TTF, then WOFF, then WOFF2) from a Google Fonts CSS response + */ +async function extractWOFF2Url(cssUrl, fontFamily) { + try { + console.log(`📥 CSS request for ${fontFamily}...`); + + const response = await fetch(cssUrl, { + headers: { + // Old User-Agent to force TTF/WOFF instead of WOFF2 + 'User-Agent': 'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)', + 'Accept': 'text/css,*/*;q=0.1' + }, + signal: AbortSignal.timeout(10000) // fetch() ignores a `timeout` option; abort via signal instead + }); + + if (!response.ok) { + throw new Error(`HTTP ${response.status} - ${response.statusText}`); + } + + const css = await response.text(); + + if (!css || css.trim().length === 0) { + throw new Error('Empty CSS response'); + } + + console.log(`📄 CSS received (${css.length} characters)`); + if (css.includes('font-family')) { + 
console.log(`✅ Valid CSS for ${fontFamily}`); + } else { + console.log(`⚠️ Suspicious CSS for ${fontFamily} - no font-family found`); + } + + // Look for TTF first (most compatible) + const ttfMatch = css.match(/url\((https:\/\/fonts\.gstatic\.com\/[^)]+\.ttf)\)/); + if (ttfMatch) { + return { url: ttfMatch[1], format: 'ttf' }; + } + + // Look for WOFF (compatible with opentype.js) + const woffMatch = css.match(/url\((https:\/\/fonts\.gstatic\.com\/[^)]+\.woff)\)/); + if (woffMatch) { + return { url: woffMatch[1], format: 'woff' }; + } + + // Look for WOFF2 as last resort + const woff2Match = css.match(/url\((https:\/\/fonts\.gstatic\.com\/[^)]+\.woff2)\)/); + if (woff2Match) { + return { url: woff2Match[1], format: 'woff2' }; + } + + throw new Error('No font file found in CSS'); + } catch (error) { + console.error(`❌ Error extracting font URL for ${fontFamily}:`, error.message); + return null; + } +} + +/** + * Downloads and converts a Google Font to TTF + */ +async function downloadAndConvertGoogleFont(fontFamily, outputPath) { + try { + // Clean and properly encode the family name + const cleanFontFamily = fontFamily.trim(); + const encodedFamily = encodeURIComponent(cleanFontFamily); + + // Build Google Fonts CSS URL with special character handling + const cssUrl = `https://fonts.googleapis.com/css2?family=${encodedFamily}:wght@400&display=swap`; + + console.log(`🔍 Extracting font URL from Google Fonts for "${cleanFontFamily}"...`); + console.log(`🔗 CSS URL: ${cssUrl}`); + + const fontInfo = await extractWOFF2Url(cssUrl, cleanFontFamily); + + if (!fontInfo) { + throw new Error('Font URL not found'); + } + + console.log(`📥 Downloading ${fontInfo.format.toUpperCase()} from Google Fonts...`); + const response = await fetch(fontInfo.url); + + if (!response.ok) { + throw new Error(`HTTP ${response.status}`); + } + + const fontBuffer = await response.arrayBuffer(); + + if (fontInfo.format === 'ttf') { + // Already TTF, save directly + await fs.writeFile(outputPath, 
Buffer.from(fontBuffer)); + console.log(`✅ TTF font saved directly`); + } else if (fontInfo.format === 'woff2') { + // Convert WOFF2 to TTF + console.log(`🔄 Converting WOFF2 to TTF...`); + try { + const font = fonteditor.woff2.decode(Buffer.from(fontBuffer)); + const ttfBuffer = fonteditor.ttf.encode(font); + await fs.writeFile(outputPath, ttfBuffer); + console.log(`✅ Font converted and saved as TTF`); + } catch (conversionError) { + throw new Error(`WOFF2 conversion failed: ${conversionError.message}`); + } + } else if (fontInfo.format === 'woff') { + // WOFF version 1 - opentype.js can handle it directly + await fs.writeFile(outputPath, Buffer.from(fontBuffer)); + console.log(`✅ WOFF font saved (opentype.js can read it)`); + } + + return true; + + } catch (error) { + console.error(`❌ Error during download/conversion for ${fontFamily}:`, error.message); + return false; + } +} + + +/** + * Generates an SVG of letter A from a font + */ +async function generateLetterASVG(fontPath, fontFamily) { + try { + const fontBuffer = await fs.readFile(fontPath); + const font = opentype.parse(fontBuffer.buffer); + + // Get the glyph for letter 'A' + const glyph = font.charToGlyph('A'); + + if (!glyph || !glyph.path) { + throw new Error('Glyph A not found or without path'); + } + + // Uniform configuration + const SVG_SIZE = 80; // Fixed 80x80 size + const fontSize = 60; // Reduced font size to leave margins + + // Get glyph dimensions + const tempPath = glyph.getPath(0, 0, fontSize); + const bbox = tempPath.getBoundingBox(); + + // Calculate actual glyph dimensions + const glyphWidth = bbox.x2 - bbox.x1; + const glyphHeight = bbox.y2 - bbox.y1; + + // Center perfectly in 80x80 canvas + const centerX = SVG_SIZE / 2; + const centerY = SVG_SIZE / 2; + + // Position glyph to be centered + const offsetX = centerX - (bbox.x1 + glyphWidth / 2); + const offsetY = centerY - (bbox.y1 + glyphHeight / 2); + + // Generate final centered path + const adjustedPath = glyph.getPath(offsetX, 
offsetY, fontSize); + + // Generate SVG with fixed dimensions + const svgPathData = adjustedPath.toPathData(2); + const svg = ` + +`; + + return { + svg, + width: SVG_SIZE, + height: SVG_SIZE, + fontMetrics: { + unitsPerEm: font.unitsPerEm, + ascender: font.ascender, + descender: font.descender + } + }; + + } catch (error) { + console.error(`❌ Error generating SVG for ${fontFamily}:`, error.message); + return null; + } +} + +/** + * Validates that a font name is compatible with Google Fonts + */ +function validateFontName(fontName) { + if (!fontName || typeof fontName !== 'string') { + return { valid: false, reason: 'Empty or invalid name' }; + } + + const trimmed = fontName.trim(); + if (trimmed.length === 0) { + return { valid: false, reason: 'Empty name after cleanup' }; + } + + if (trimmed.length > 100) { + return { valid: false, reason: 'Name too long' }; + } + + // Problematic characters for Google Fonts URLs + const problematicChars = /[<>'"&]/; + if (problematicChars.test(trimmed)) { + return { valid: false, reason: 'Problematic characters detected' }; + } + + return { valid: true, cleaned: trimmed }; +} + +/** + * Converts a font name to usable ID + */ +function fontNameToId(fontName) { + return fontName + .toLowerCase() + .replace(/[^a-z0-9]+/g, '_') + .replace(/^_|_$/g, ''); +} + +/** + * Processes a font: download and SVG generation + */ +async function processFont(fontData, index, total) { + const fontFamily = fontData.family; + + console.log(`\n[${index + 1}/${total}] 🔄 Processing "${fontFamily}"...`); + + // Font name validation + const validation = validateFontName(fontFamily); + if (!validation.valid) { + console.error(`❌ Invalid font "${fontFamily}": ${validation.reason}`); + return { + fontFamily, + fontId: fontNameToId(fontFamily), + status: 'error', + error: `Invalid name: ${validation.reason}` + }; + } + + const cleanFontFamily = validation.cleaned; + const fontId = fontNameToId(cleanFontFamily); + + // File paths + const fontPath = 
path.join(FONTS_DIR, `${fontId}.ttf`); + + try { + // Download font directly + console.log(`⬇️ Downloading ${fontFamily} from Google Fonts...`); + const downloadSuccess = await downloadAndConvertGoogleFont(fontFamily, fontPath); + + if (!downloadSuccess) { + throw new Error('Download/conversion from Google Fonts failed'); + } + + console.log(`✅ Font downloaded and ready: ${fontFamily}`); + + return { + fontFamily, + fontId, + status: 'downloaded', + fontPath: fontPath + }; + + } catch (error) { + console.error(`❌ Error for ${fontFamily}:`, error.message); + return { + fontFamily, + fontId, + status: 'error', + error: error.message + }; + } +} + +/** + * Updates font manifest with new SVGs + */ +async function updateFontManifest(results) { + try { + console.log('\n📝 Updating font manifest...'); + + // Read existing manifest + let manifest = {}; + try { + const manifestData = await fs.readFile(FONT_MANIFEST_PATH, 'utf-8'); + manifest = JSON.parse(manifestData); + } catch { + // Create new manifest if none exists + } + + // Read existing typography data + let typographyData = []; + try { + const typographyDataContent = await fs.readFile(TYPOGRAPHY_DATA_PATH, 'utf-8'); + typographyData = JSON.parse(typographyDataContent); + } catch { + // Use empty array if no data exists + } + + // Update with new results + const successfulResults = results.filter(r => r.status === 'downloaded'); + + for (const result of successfulResults) { + const { fontFamily, fontId, svgPath, dimensions, fontMetrics } = result; + + // Find corresponding typography data + const typographyEntry = typographyData.find(entry => entry.name === fontFamily); + const family = typographyEntry?.family || 'sans-serif'; + + // Update manifest + manifest[fontFamily] = { + id: fontId, + family: family, + images: { + A: svgPath, + a: svgPath // Use same SVG for lowercase and uppercase for now + }, + svg: { + A: { + path: svgPath, + width: dimensions.width, + height: dimensions.height, + viewBox: `0 0 
${dimensions.width} ${dimensions.height}` + } + }, + fontMetrics: fontMetrics + }; + } + + // Save updated manifest + await fs.writeFile(FONT_MANIFEST_PATH, JSON.stringify(manifest, null, 2), 'utf-8'); + + console.log(`✅ Manifest updated with ${successfulResults.length} fonts`); + + } catch (error) { + console.error('❌ Error updating manifest:', error.message); + } +} + +/** + * Main function + */ +async function main() { + console.log('🚀 Generating Google Fonts SVGs\n'); + + try { + // Create necessary directories + await fs.mkdir(FONTS_DIR, { recursive: true }); + await fs.mkdir(SVGS_DIR, { recursive: true }); + await fs.mkdir(path.dirname(FONT_MANIFEST_PATH), { recursive: true }); + + // Get fonts list + console.log('📋 Fetching fonts list...'); + const fonts = await fetchGoogleFontsList(); + + if (fonts.length === 0) { + console.error('❌ No fonts found'); + process.exit(1); + } + + console.log(`📊 ${fonts.length} fonts found`); + + // Processing 300 fonts + const limitedFonts = fonts.slice(0, 300); + console.log(`🔬 Processing first ${limitedFonts.length} fonts`); + + // Process each font + const results = []; + for (let i = 0; i < limitedFonts.length; i++) { + const result = await processFont(limitedFonts[i], i, limitedFonts.length); + results.push(result); + + // Pause between requests to avoid rate limiting + if (i < limitedFonts.length - 1) { + await new Promise(resolve => setTimeout(resolve, 100)); + } + } + + // Note: Manifest will be updated in later steps when SVGs are available + + // Display final statistics + const downloaded = results.filter(r => r.status === 'downloaded').length; + const errors = results.filter(r => r.status === 'error').length; + + console.log('\n📊 Final statistics:'); + console.log(`✅ Downloaded fonts: ${downloaded}`); + console.log(`❌ Errors: ${errors}`); + console.log(`📋 Total processed: ${results.length}`); + + if (errors > 0) { + console.log('\n❌ Fonts with errors:'); + results + .filter(r => r.status === 'error') + .forEach(r 
=> console.log(` - ${r.fontFamily}: ${r.error}`)); + } + + } catch (error) { + console.error('💥 Fatal error:', error.message); + process.exit(1); + } +} + +// Execute script if run directly +if (import.meta.url === `file://${process.argv[1]}`) { + main(); +} + +export { main, generateLetterASVG, fontNameToId }; diff --git a/app/src/content/embeds/typography/2-generate-svgs.mjs b/app/src/content/embeds/typography/2-generate-svgs.mjs new file mode 100644 index 0000000000000000000000000000000000000000..f1c9924fbe15249cd9b7a6fa60a6282fbdc940b8 --- /dev/null +++ b/app/src/content/embeds/typography/2-generate-svgs.mjs @@ -0,0 +1,265 @@ +#!/usr/bin/env node + +import fs from 'fs/promises'; +import path from 'path'; +import { fileURLToPath } from 'url'; +import opentype from 'opentype.js'; + +const __filename = fileURLToPath(import.meta.url); +const __dirname = path.dirname(__filename); + +// Configuration +const TYPOGRAPHY_BASE = __dirname; +const GENERATED_DIR = path.join(TYPOGRAPHY_BASE, 'generated'); +const FONTS_DIR = path.join(GENERATED_DIR, 'fonts'); +const SVGS_DIR = path.join(GENERATED_DIR, 'svgs'); +const FONT_MANIFEST_PATH = path.join(GENERATED_DIR, 'data', 'font_manifest.json'); +const TYPOGRAPHY_DATA_PATH = path.join(GENERATED_DIR, 'data', 'typography_data.json'); + +/** + * Updates font manifest with new SVGs + */ +async function updateFontManifest(results) { + try { + console.log('\n📝 Updating font manifest...'); + + // Read existing manifest + let manifest = {}; + try { + const manifestData = await fs.readFile(FONT_MANIFEST_PATH, 'utf-8'); + manifest = JSON.parse(manifestData); + } catch { + // Create new manifest if none exists + } + + // Read existing typography data + let typographyData = []; + try { + const typographyDataContent = await fs.readFile(TYPOGRAPHY_DATA_PATH, 'utf-8'); + const data = JSON.parse(typographyDataContent); + typographyData = data.fonts || []; + } catch { + // Use empty array if no data exists + } + + // Update with new results + 
const successfulResults = results.filter(r => r.status === 'success'); + + for (const result of successfulResults) { + const { fontFamily, fontId, svgPath, dimensions, fontMetrics } = result; + + // Find corresponding typography data + const typographyEntry = typographyData.find(entry => entry.name === fontFamily); + const family = typographyEntry?.family || 'sans-serif'; + + // Update manifest + manifest[fontFamily] = { + id: fontId, + family: family, + images: { + A: svgPath, + a: svgPath // Use same SVG for lowercase and uppercase for now + }, + svg: { + A: { + path: svgPath, + width: dimensions.width, + height: dimensions.height, + viewBox: `0 0 ${dimensions.width} ${dimensions.height}` + } + }, + fontMetrics: fontMetrics + }; + } + + // Ensure data directory exists + await fs.mkdir(path.dirname(FONT_MANIFEST_PATH), { recursive: true }); + + // Save updated manifest + await fs.writeFile(FONT_MANIFEST_PATH, JSON.stringify(manifest, null, 2), 'utf-8'); + + console.log(`✅ Manifest updated with ${successfulResults.length} fonts`); + + } catch (error) { + console.error('❌ Error updating manifest:', error.message); + } +} + +/** + * Generates an SVG of letter A from a font + */ +async function generateLetterASVG(fontPath, fontFamily) { + try { + const fontBuffer = await fs.readFile(fontPath); + const font = opentype.parse(fontBuffer.buffer); + + // Get the glyph for letter 'A' + const glyph = font.charToGlyph('A'); + + if (!glyph || !glyph.path) { + throw new Error('Glyph A not found or without path'); + } + + // Uniform configuration + const SVG_SIZE = 80; // Fixed size 80x80 + const fontSize = 60; // Reduced font size to leave margins + + // Get glyph dimensions + const tempPath = glyph.getPath(0, 0, fontSize); + const bbox = tempPath.getBoundingBox(); + + // Calculate actual glyph dimensions + const glyphWidth = bbox.x2 - bbox.x1; + const glyphHeight = bbox.y2 - bbox.y1; + + // Center perfectly in 80x80 canvas + const centerX = SVG_SIZE / 2; + const centerY = 
SVG_SIZE / 2; + + // Position glyph to be centered + const offsetX = centerX - (bbox.x1 + glyphWidth / 2); + const offsetY = centerY - (bbox.y1 + glyphHeight / 2); + + // Generate final centered path + const adjustedPath = glyph.getPath(offsetX, offsetY, fontSize); + + // Generate SVG with fixed dimensions + const svgPathData = adjustedPath.toPathData(2); + const svg = `<svg width="${SVG_SIZE}" height="${SVG_SIZE}" viewBox="0 0 ${SVG_SIZE} ${SVG_SIZE}" xmlns="http://www.w3.org/2000/svg"> + <path d="${svgPathData}"/> +</svg>`; + + return { + svg, + width: SVG_SIZE, + height: SVG_SIZE, + fontMetrics: { + unitsPerEm: font.unitsPerEm, + ascender: font.ascender, + descender: font.descender + } + }; + + } catch (error) { + console.error(`❌ Error generating SVG for ${fontFamily}:`, error.message); + return null; + } +} + +/** + * Converts a font name to usable ID + */ +function fontNameToId(fontName) { + return fontName + .toLowerCase() + .replace(/[^a-z0-9]+/g, '_') + .replace(/^_|_$/g, ''); +} + +/** + * Generates SVGs for all fonts in the folder + */ +async function generateSVGsForAllFonts() { + console.log('🎨 Generating SVGs for all downloaded fonts\n'); + + try { + // Create SVG folder if necessary + await fs.mkdir(SVGS_DIR, { recursive: true }); + + // Read all TTF files + const fontFiles = await fs.readdir(FONTS_DIR); + const ttfFiles = fontFiles.filter(file => file.endsWith('.ttf')); + + if (ttfFiles.length === 0) { + console.error('❌ No TTF files found in', FONTS_DIR); + process.exit(1); + } + + console.log(`📁 Found ${ttfFiles.length} TTF files`); + + const results = []; + + for (let i = 0; i < ttfFiles.length; i++) { + const ttfFile = ttfFiles[i]; + const fontPath = path.join(FONTS_DIR, ttfFile); + + // Extract font name from filename + const fontId = ttfFile.replace('.ttf', ''); + const fontFamily = fontId.replace(/_/g, ' '); + + console.log(`\n[${i + 1}/${ttfFiles.length}] 🔄 Generating SVG for "${fontFamily}"...`); + + try { + // Generate SVG + const svgResult = await generateLetterASVG(fontPath, fontFamily); + + if (!svgResult) { + results.push({ + fontFamily, + fontId, + status: 'error', + error: 'SVG
generation failed' + }); + continue; + } + + // Save SVG + const svgPath = path.join(SVGS_DIR, `${fontId}_a.svg`); + await fs.writeFile(svgPath, svgResult.svg, 'utf-8'); + + console.log(`✅ SVG generated: ${fontFamily} (${svgResult.width}x${svgResult.height})`); + + results.push({ + fontFamily, + fontId, + status: 'success', + svgPath: `/content/embeds/typography/font_svgs/${fontId}_a.svg`, + dimensions: { + width: svgResult.width, + height: svgResult.height + }, + fontMetrics: svgResult.fontMetrics + }); + + } catch (error) { + console.error(`❌ Error for ${fontFamily}:`, error.message); + results.push({ + fontFamily, + fontId, + status: 'error', + error: error.message + }); + } + } + + // Update font manifest + await updateFontManifest(results); + + // Display final statistics + const successful = results.filter(r => r.status === 'success').length; + const errors = results.filter(r => r.status === 'error').length; + + console.log('\n📊 Final statistics:'); + console.log(`✅ SVGs generated successfully: ${successful}`); + console.log(`❌ Errors: ${errors}`); + console.log(`📋 Total processed: ${results.length}`); + + if (errors > 0) { + console.log('\n❌ Fonts with errors:'); + results + .filter(r => r.status === 'error') + .forEach(r => console.log(` - ${r.fontFamily}: ${r.error}`)); + } + + } catch (error) { + console.error('💥 Fatal error:', error.message); + process.exit(1); + } +} + +// Execute script if run directly +if (import.meta.url === `file://${process.argv[1]}`) { + generateSVGsForAllFonts(); +} + +export { generateSVGsForAllFonts, generateLetterASVG }; \ No newline at end of file diff --git a/app/src/content/embeds/typography/3-generate-pngs.mjs b/app/src/content/embeds/typography/3-generate-pngs.mjs new file mode 100644 index 0000000000000000000000000000000000000000..7c3de9603f7c18412ed7f700d5d881e3e3c3a3a8 --- /dev/null +++ b/app/src/content/embeds/typography/3-generate-pngs.mjs @@ -0,0 +1,140 @@ +#!/usr/bin/env node + +import fs from 'fs/promises'; 
+import path from 'path'; +import { fileURLToPath } from 'url'; +import sharp from 'sharp'; + +const __filename = fileURLToPath(import.meta.url); +const __dirname = path.dirname(__filename); + +// Configuration +const TYPOGRAPHY_BASE = __dirname; +const GENERATED_DIR = path.join(TYPOGRAPHY_BASE, 'generated'); +const SVGS_DIR = path.join(GENERATED_DIR, 'svgs'); +const PNGS_DIR = path.join(GENERATED_DIR, 'pngs'); +const PNG_SIZE = 40; // Final size 40x40 pixels + +/** + * Converts an SVG to PNG + */ +async function convertSvgToPng(svgPath, pngPath) { + try { + const svgBuffer = await fs.readFile(svgPath); + + await sharp(svgBuffer) + .resize(PNG_SIZE, PNG_SIZE, { + fit: 'contain', + background: { r: 255, g: 255, b: 255, alpha: 1 } // White background + }) + .flatten({ background: { r: 255, g: 255, b: 255 } }) // Force white background + .png() + .toFile(pngPath); + + return true; + } catch (error) { + console.error(`❌ Error during conversion ${svgPath}:`, error.message); + return false; + } +} + +/** + * Generates PNGs for all SVGs + */ +async function generatePNGsForAllSVGs() { + console.log(`🖼️ Generating ${PNG_SIZE}x${PNG_SIZE} PNGs for all SVGs\n`); + + try { + // Create PNG folder if necessary + await fs.mkdir(PNGS_DIR, { recursive: true }); + + // Read all SVG files + const svgFiles = await fs.readdir(SVGS_DIR); + const svgFilesFiltered = svgFiles.filter(file => file.endsWith('.svg')); + + if (svgFilesFiltered.length === 0) { + console.error('❌ No SVG files found in', SVGS_DIR); + process.exit(1); + } + + console.log(`📁 Found ${svgFilesFiltered.length} SVG files`); + + const results = []; + + for (let i = 0; i < svgFilesFiltered.length; i++) { + const svgFile = svgFilesFiltered[i]; + const svgPath = path.join(SVGS_DIR, svgFile); + + // Create PNG filename + const pngFile = svgFile.replace('.svg', '.png'); + const pngPath = path.join(PNGS_DIR, pngFile); + + // Extract font name from filename + const fontId = svgFile.replace('_a.svg', ''); + const fontFamily = 
fontId.replace(/_/g, ' '); + + console.log(`[${i + 1}/${svgFilesFiltered.length}] 🔄 Converting "${fontFamily}"...`); + + try { + const success = await convertSvgToPng(svgPath, pngPath); + + if (success) { + console.log(`✅ PNG generated: ${fontFamily} (${PNG_SIZE}x${PNG_SIZE})`); + results.push({ + fontFamily, + fontId, + status: 'success', + pngPath: `/content/embeds/typography/font_pngs/${pngFile}`, + dimensions: { + width: PNG_SIZE, + height: PNG_SIZE + } + }); + } else { + results.push({ + fontFamily, + fontId, + status: 'error', + error: 'SVG to PNG conversion failed' + }); + } + + } catch (error) { + console.error(`❌ Error for ${fontFamily}:`, error.message); + results.push({ + fontFamily, + fontId, + status: 'error', + error: error.message + }); + } + } + + // Display final statistics + const successful = results.filter(r => r.status === 'success').length; + const errors = results.filter(r => r.status === 'error').length; + + console.log('\n📊 Final statistics:'); + console.log(`✅ PNGs generated successfully: ${successful}`); + console.log(`❌ Errors: ${errors}`); + console.log(`📋 Total processed: ${results.length}`); + + if (errors > 0) { + console.log('\n❌ Fonts with errors:'); + results + .filter(r => r.status === 'error') + .forEach(r => console.log(` - ${r.fontFamily}: ${r.error}`)); + } + + } catch (error) { + console.error('💥 Fatal error:', error.message); + process.exit(1); + } +} + +// Execute script if run directly +if (import.meta.url === `file://${process.argv[1]}`) { + generatePNGsForAllSVGs(); +} + +export { generatePNGsForAllSVGs }; \ No newline at end of file diff --git a/app/src/content/embeds/typography/4-generate-umap.py b/app/src/content/embeds/typography/4-generate-umap.py new file mode 100644 index 0000000000000000000000000000000000000000..d98a6c971cbe667c9b501358baa5c7282fdc9690 --- /dev/null +++ b/app/src/content/embeds/typography/4-generate-umap.py @@ -0,0 +1,262 @@ +#!/usr/bin/env python3 +""" +UMAP generator for typography fonts 
+Based on pixel matrices from generated PNGs +""" + +import umap +import numpy as np +import pandas as pd +import json +import os +import glob +from PIL import Image +from sklearn.preprocessing import StandardScaler +from datetime import datetime + +# Configuration +SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__)) +GENERATED_DIR = os.path.join(SCRIPT_DIR, "generated") +PNGS_DIR = os.path.join(GENERATED_DIR, "pngs") +DATA_DIR = os.path.join(GENERATED_DIR, "data") +OUTPUT_FILENAME = "typography_data.json" +FULL_OUTPUT_PATH = os.path.join(DATA_DIR, OUTPUT_FILENAME) + +# UMAP parameters +UMAP_PARAMS = { + 'n_neighbors': 15, + 'min_dist': 1.0, + 'n_components': 2, + 'metric': 'euclidean', + 'random_state': 42 +} + +def load_png_as_matrix(png_path): + """ + Loads a PNG and converts it to a normalized pixel matrix + + Returns: + numpy.array: 1D vector of 1600 dimensions (40x40 flattened) + """ + try: + # Load image in grayscale + img = Image.open(png_path).convert('L') + + # Check dimensions + if img.size != (40, 40): + print(f"⚠️ Unexpected size for {png_path}: {img.size}") + img = img.resize((40, 40)) + + # Convert to numpy array and normalize (0-255 → 0-1) + pixel_matrix = np.array(img, dtype=np.float32) / 255.0 + + # Flatten to 1D vector + pixel_vector = pixel_matrix.flatten() + + return pixel_vector + + except Exception as e: + print(f"❌ Error loading {png_path}: {e}") + return None + +def extract_font_info_from_filename(filename): + """ + Extracts font information from filename + + Args: + filename: filename (e.g., "roboto_a.png") + + Returns: + dict: font information + """ + # Strip only the trailing ".png" extension and "_a" suffix, so that an "_a" + # inside the name (e.g. "noto_sans_arabic_a.png") is left intact + font_id = filename.removesuffix('.png').removesuffix('_a') + font_name = font_id.replace('_', ' ').title() + + # Simple classification based on names + category = "sans-serif" # default + + # Classification rules based on names + serif_keywords = ['times', 'garamond', 'georgia', 'serif', 'baskerville', + 'caslon', 'merriweather', 'playfair', 'lora',
'crimson', + 'spectral', 'alegreya', 'cardo', 'vollkorn', 'gentium', + 'eb garamond', 'cormorant', 'libre baskerville'] + + script_keywords = ['script', 'cursive', 'brush', 'hand', 'dancing', + 'pacifico', 'satisfy', 'allura', 'tangerine', 'caveat', + 'sacramento', 'kaushan', 'alex brush', 'marck script'] + + mono_keywords = ['mono', 'code', 'courier', 'consola', 'inconsolata', + 'fira code', 'source code', 'jetbrains', 'roboto mono', + 'space mono', 'ubuntu mono', 'pt mono'] + + display_keywords = ['display', 'black', 'ultra', 'bebas', 'anton', 'oswald', + 'staatliches', 'bangers', 'fredoka', 'righteous', + 'russo one', 'alfa slab'] + + font_lower = font_name.lower() + + if any(keyword in font_lower for keyword in serif_keywords): + category = "serif" + elif any(keyword in font_lower for keyword in script_keywords): + category = "handwriting" + elif any(keyword in font_lower for keyword in mono_keywords): + category = "monospace" + elif any(keyword in font_lower for keyword in display_keywords): + category = "display" + + # Generate the Google Fonts URL (using the title-cased name) + google_fonts_url = f"https://fonts.google.com/specimen/{font_name.replace(' ', '+')}" + + return { + "name": font_name, + "id": font_id, + "family": category, + "google_fonts_url": google_fonts_url + } + +def load_all_font_data(): + """ + Loads all font data from PNGs + + Returns: + tuple: (font_data_list, pixel_matrices) + """ + print("🔄 Loading font data from PNGs...") + + # Create data folder if necessary + os.makedirs(DATA_DIR, exist_ok=True) + + # Find all PNG files + png_pattern = os.path.join(PNGS_DIR, "*_a.png") + png_files = glob.glob(png_pattern) + + if not png_files: + raise FileNotFoundError(f"No PNG files found in {PNGS_DIR}") + + print(f"📁 Found {len(png_files)} PNG files") + + font_data_list = [] + pixel_matrices = [] + + for i, png_path in enumerate(png_files): + filename = os.path.basename(png_path) + + # Extract font info + font_info =
extract_font_info_from_filename(filename) + + # Load pixel matrix + pixel_matrix = load_png_as_matrix(png_path) + + if pixel_matrix is not None: + font_data_list.append(font_info) + pixel_matrices.append(pixel_matrix) + + if (i + 1) % 50 == 0: + print(f"⚡ Processed {i + 1}/{len(png_files)} fonts...") + + print(f"✅ Loaded {len(font_data_list)} fonts successfully") + + # Convert to numpy array + pixel_matrices = np.array(pixel_matrices) + print(f"📊 Final matrix: {pixel_matrices.shape} ({pixel_matrices.shape[0]} fonts × {pixel_matrices.shape[1]} pixels)") + + return font_data_list, pixel_matrices + +def generate_umap_embedding(pixel_matrices): + """ + Generates UMAP embeddings from pixel matrices + + Args: + pixel_matrices: numpy array (n_fonts, 1600) + + Returns: + numpy.array: 2D UMAP coordinates + """ + print("🔄 Generating UMAP embeddings...") + + # Normalize data (important for UMAP) + print("📊 Normalizing data...") + scaler = StandardScaler() + normalized_data = scaler.fit_transform(pixel_matrices) + + # Apply UMAP + print(f"🗺️ Applying UMAP with parameters: {UMAP_PARAMS}") + reducer = umap.UMAP(**UMAP_PARAMS) + embedding = reducer.fit_transform(normalized_data) + + print(f"✅ UMAP completed - Embedding shape: {embedding.shape}") + print(f"📊 X range: [{embedding[:, 0].min():.2f}, {embedding[:, 0].max():.2f}]") + print(f"📊 Y range: [{embedding[:, 1].min():.2f}, {embedding[:, 1].max():.2f}]") + + return embedding + +def save_typography_data(font_data_list, embedding): + """ + Saves final data in JSON format + """ + print("💾 Saving data...") + + # Combine font data and UMAP coordinates + final_data = [] + for i, font_info in enumerate(font_data_list): + font_data = { + **font_info, + "x": float(embedding[i, 0]), + "y": float(embedding[i, 1]) + } + final_data.append(font_data) + + # Metadata + metadata = { + "generated_at": datetime.now().isoformat(), + "method": "umap_from_png_pixels", + "total_fonts": len(final_data), + "umap_params": UMAP_PARAMS, + "data_source": 
"PNG pixel matrices (40x40)" + } + + # Final structure + output_data = { + "metadata": metadata, + "fonts": final_data + } + + # Save + with open(FULL_OUTPUT_PATH, 'w', encoding='utf-8') as f: + json.dump(output_data, f, indent=2, ensure_ascii=False) + + print(f"✅ Data saved to {FULL_OUTPUT_PATH}") + + # Statistics by category + categories = {} + for font in final_data: + cat = font['family'] + categories[cat] = categories.get(cat, 0) + 1 + + print("\n📊 Distribution by category:") + for cat, count in sorted(categories.items()): + print(f" {cat}: {count} fonts") + +def main(): + """Main function""" + print("🎨 UMAP generation for typography from pixel matrices\n") + + try: + # 1. Load font data + font_data_list, pixel_matrices = load_all_font_data() + + # 2. Generate UMAP embeddings + embedding = generate_umap_embedding(pixel_matrices) + + # 3. Save results + save_typography_data(font_data_list, embedding) + + print("\n🎉 UMAP generation completed successfully!") + + except Exception as e: + print(f"💥 Fatal error: {e}") + raise + +if __name__ == "__main__": + main() \ No newline at end of file diff --git a/app/src/content/embeds/typography/5-generate-sprite.mjs b/app/src/content/embeds/typography/5-generate-sprite.mjs new file mode 100644 index 0000000000000000000000000000000000000000..fd2426db36a290174a3a570015bc3d027253422c --- /dev/null +++ b/app/src/content/embeds/typography/5-generate-sprite.mjs @@ -0,0 +1,108 @@ +#!/usr/bin/env node + +import fs from 'fs/promises'; +import path from 'path'; +import { fileURLToPath } from 'url'; + +const __dirname = path.dirname(fileURLToPath(import.meta.url)); + +// Configuration +const TYPOGRAPHY_BASE = __dirname; +const GENERATED_DIR = path.join(TYPOGRAPHY_BASE, 'generated'); +const SVGS_DIR = path.join(GENERATED_DIR, 'svgs'); +const DATA_DIR = path.join(GENERATED_DIR, 'data'); +const SPRITES_DIR = path.join(GENERATED_DIR, 'sprites'); +const OUTPUT_SPRITE = path.join(SPRITES_DIR, 'font-sprite.svg'); + +async function 
generateSvgSprite() { + console.log('🎨 Generating SVG sprite...'); + + try { + // Read all SVG files + const files = await fs.readdir(SVGS_DIR); + const svgFiles = files.filter(file => file.endsWith('.svg')); + + console.log(`📁 Found ${svgFiles.length} SVG files`); + + let sprites = []; + let processedCount = 0; + + // Process each SVG + for (const file of svgFiles) { + try { + const filePath = path.join(SVGS_DIR, file); + const content = await fs.readFile(filePath, 'utf-8'); + + // Extract SVG content (without the <svg> tags) + const match = content.match(/<svg[^>]*>(.*?)<\/svg>/s); + if (!match) continue; + + const innerContent = match[1].trim(); + if (!innerContent) continue; + + // Create symbol ID from filename + const symbolId = file.replace('.svg', ''); + + // Extract viewBox if present + const viewBoxMatch = content.match(/viewBox=["']([^"']+)["']/); + const viewBox = viewBoxMatch ? viewBoxMatch[1] : '0 0 80 80'; + + // Create symbol + sprites.push(`<symbol id="${symbolId}" viewBox="${viewBox}"> + ${innerContent} + </symbol>`); + + processedCount++; + + if (processedCount % 100 === 0) { + console.log(`⚡ Processed ${processedCount}/${svgFiles.length} SVGs...`); + } + + } catch (error) { + console.warn(`⚠️ Error with ${file}:`, error.message); + } + } + + // Create final sprite + const spriteContent = ` +<svg xmlns="http://www.w3.org/2000/svg"> +<defs> +${sprites.join('\n')} +</defs> +</svg>`; + + // Create output folder if necessary + await fs.mkdir(path.dirname(OUTPUT_SPRITE), { recursive: true }); + + // Write sprite file + await fs.writeFile(OUTPUT_SPRITE, spriteContent, 'utf-8'); + + console.log(`✅ SVG sprite generated with ${sprites.length} symbols`); + console.log(`📍 File: ${OUTPUT_SPRITE}`); + console.log(`📊 Size: ${(spriteContent.length / 1024).toFixed(1)} KB`); + + // Also generate mapping file for easier usage + const mapping = {}; + svgFiles.forEach(file => { + const fontName = file.replace('_a.svg', '').replace(/_/g, ' '); + const symbolId = file.replace('.svg', ''); + mapping[fontName] = symbolId; + }); + + const mappingFile = path.join(DATA_DIR,
'font-sprite-mapping.json'); + await fs.writeFile(mappingFile, JSON.stringify(mapping, null, 2)); + + console.log(`🗺️ Mapping generated: ${mappingFile}`); + + } catch (error) { + console.error('❌ Error during generation:', error); + process.exit(1); + } +} + +// Execute script +if (import.meta.url === `file://${process.argv[1]}`) { + generateSvgSprite(); +} + +export { generateSvgSprite }; diff --git a/app/src/content/embeds/typography/auto-continue-pipeline.sh b/app/src/content/embeds/typography/auto-continue-pipeline.sh new file mode 100755 index 0000000000000000000000000000000000000000..15f8c8190925792a89d686d47b9307a9957514d7 --- /dev/null +++ b/app/src/content/embeds/typography/auto-continue-pipeline.sh @@ -0,0 +1,90 @@ +#!/bin/bash + +# Script to monitor and automatically continue the pipeline + +echo "🔍 Pipeline monitoring in progress..." + +# Wait for step 1 to complete (check for presence of 300 fonts) +echo "⏳ Waiting for fonts download to complete..." + +while true; do + if [ -d "generated/fonts" ]; then + font_count=$(ls generated/fonts/*.ttf 2>/dev/null | wc -l) + echo "📈 Fonts downloaded: $font_count/300" + + if [ "$font_count" -ge 295 ]; then # We accept 295+ in case some fail + echo "✅ Download completed! Launching next steps..." + break + fi + else + echo "📁 generated/fonts directory not yet created..." + fi + + sleep 5 +done + +# Step 2: Generate SVGs +echo "" +echo "🎨 Step 2: SVG Generation..." +node 2-generate-svgs.mjs + +if [ $? -eq 0 ]; then + echo "✅ Step 2 completed successfully" +else + echo "❌ Step 2 Error" + exit 1 +fi + +# Step 3: Generate PNGs +echo "" +echo "🖼️ Step 3: Converting to PNGs..." +node 3-generate-pngs.mjs + +if [ $? -eq 0 ]; then + echo "✅ Step 3 completed successfully" +else + echo "❌ Step 3 Error" + exit 1 +fi + +# Step 4: Generate UMAP +echo "" +echo "🗺️ Step 4: UMAP Generation..." +poetry run python 4-generate-umap.py + +if [ $? 
-eq 0 ]; then + echo "✅ Step 4 completed successfully" +else + echo "❌ Step 4 Error" + exit 1 +fi + +# Step 5: Generate Sprite +echo "" +echo "🎯 Step 5: Sprite Generation..." +node 5-generate-sprite.mjs + +if [ $? -eq 0 ]; then + echo "✅ Step 5 completed successfully" + echo "" + echo "🎉 Complete pipeline finished successfully!" + + # Display final statistics + echo "" + echo "📊 Final Results:" + echo "📁 Fonts TTF: $(ls generated/fonts/*.ttf 2>/dev/null | wc -l)" + echo "🎨 SVGs: $(ls generated/svgs/*.svg 2>/dev/null | wc -l)" + echo "🖼️ PNGs: $(ls generated/pngs/*.png 2>/dev/null | wc -l)" + echo "📄 Data files:" + ls -la generated/data/ 2>/dev/null + + # Check manifest + if [ -f "generated/data/font_manifest.json" ]; then + manifest_count=$(jq 'keys | length' generated/data/font_manifest.json 2>/dev/null) + echo "📝 Fonts in manifest: $manifest_count" + fi + +else + echo "❌ Step 5 Error" + exit 1 +fi \ No newline at end of file diff --git a/app/src/content/embeds/typography/poetry.lock b/app/src/content/embeds/typography/poetry.lock new file mode 100644 index 0000000000000000000000000000000000000000..4e559eb06d969b2ef69fd380e266a1770f6f3c08 --- /dev/null +++ b/app/src/content/embeds/typography/poetry.lock @@ -0,0 +1,654 @@ +# This file is automatically @generated by Poetry 2.1.3 and should not be changed by hand. + +[[package]] +name = "colorama" +version = "0.4.6" +description = "Cross-platform colored terminal text." 
+optional = false +python-versions = "!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,!=3.4.*,!=3.5.*,!=3.6.*,>=2.7" +groups = ["main"] +markers = "platform_system == \"Windows\"" +files = [ + {file = "colorama-0.4.6-py2.py3-none-any.whl", hash = "sha256:4f1d9991f5acc0ca119f9d443620b77f9d6b33703e51011c16baf57afb285fc6"}, + {file = "colorama-0.4.6.tar.gz", hash = "sha256:08695f5cb7ed6e0531a20572697297273c47b8cae5a63ffc6d6ed5c201be6e44"}, +] + +[[package]] +name = "joblib" +version = "1.5.2" +description = "Lightweight pipelining with Python functions" +optional = false +python-versions = ">=3.9" +groups = ["main"] +files = [ + {file = "joblib-1.5.2-py3-none-any.whl", hash = "sha256:4e1f0bdbb987e6d843c70cf43714cb276623def372df3c22fe5266b2670bc241"}, + {file = "joblib-1.5.2.tar.gz", hash = "sha256:3faa5c39054b2f03ca547da9b2f52fde67c06240c31853f306aea97f13647b55"}, +] + +[[package]] +name = "llvmlite" +version = "0.45.0" +description = "lightweight wrapper around basic LLVM functionality" +optional = false +python-versions = ">=3.10" +groups = ["main"] +files = [ + {file = "llvmlite-0.45.0-cp310-cp310-macosx_10_15_x86_64.whl", hash = "sha256:3018e5f8547c8b05e736281d5bd23ff86b88ab94697db2beeaa6f3bce9cfc721"}, + {file = "llvmlite-0.45.0-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:ca7b15dc4422551f1b5fb1dbd734d5e8a9416028890d31d4e23a04fbc8a975c4"}, + {file = "llvmlite-0.45.0-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:a9c7343bec403a79248859df75c7945768de70bf547eac8c1cc8b8840e0336ba"}, + {file = "llvmlite-0.45.0-cp310-cp310-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:56713a25bf81081fc818aa36cbffb70533b3c23291ce0efc17ac8a3b684b8be3"}, + {file = "llvmlite-0.45.0-cp310-cp310-win_amd64.whl", hash = "sha256:849ba7de7153d8d92bc66577bb951c9baf8d9f67f2521c4f39c78718d471362e"}, + {file = "llvmlite-0.45.0-cp311-cp311-macosx_10_15_x86_64.whl", hash = "sha256:9b1b37e00b553e9420d9a2e327e84c5ac65a5690dcacf7fc153014780d97532a"}, + {file = 
"llvmlite-0.45.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:cd039b8da5514db2729b7c9ae7526cae8da748a540fa3ab721b50c54651d2362"}, + {file = "llvmlite-0.45.0-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:c6815d0d3f96de34491d3dc192e11e933e3448ceff0b58572a53f39795996e01"}, + {file = "llvmlite-0.45.0-cp311-cp311-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:ba79cc2cbdd0f61632ca8e9235fef3657a8aacd636d5775cd13807ceb8265f63"}, + {file = "llvmlite-0.45.0-cp311-cp311-win_amd64.whl", hash = "sha256:6188da8e9e3906b167fb64bc84a05e6bf98095d982f45f323bed5def2ba7db1c"}, + {file = "llvmlite-0.45.0-cp312-cp312-macosx_10_15_x86_64.whl", hash = "sha256:3928119253849e7c9aad4f881feb3e886370bb7ac6eccbc728b35a1be89064cc"}, + {file = "llvmlite-0.45.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:a3e9b5dad694edb9e43904ede037458ee73a18b4e2f227e44fc0f808aceab824"}, + {file = "llvmlite-0.45.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:4955635f316e3ffc0271ee7a3da586ae92cd3e70709b6cd59df641e980636d4c"}, + {file = "llvmlite-0.45.0-cp312-cp312-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:5e7497f1b75d741e568bf4a2dfccd5c702d6b5f3d232dd4a59ed851a82e587bd"}, + {file = "llvmlite-0.45.0-cp312-cp312-win_amd64.whl", hash = "sha256:6404f5363986efbe1c7c1afd19da495534e46180466d593ace5a5c042b2f3f94"}, + {file = "llvmlite-0.45.0-cp313-cp313-macosx_10_15_x86_64.whl", hash = "sha256:f719f98e4f3a6292b1a6495500b2cf668d3604907499c483b326da5ce2ff9f01"}, + {file = "llvmlite-0.45.0-cp313-cp313-macosx_12_0_arm64.whl", hash = "sha256:4ffa899f7584ef48f1037308d92cb19460a0afb834aa1fe9db9d3e52d0e81a79"}, + {file = "llvmlite-0.45.0-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:2c12fde908967e464b265554143c030ba4dcc2b981a815582d7708a30295018e"}, + {file = "llvmlite-0.45.0-cp313-cp313-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = 
"sha256:83567cbbf598eb57f108222dfc3dfee065c20a2aa004391360949f2e8ff2b8b4"}, + {file = "llvmlite-0.45.0-cp313-cp313-win_amd64.whl", hash = "sha256:f68890ceb662e874933103e91e239389ff7275c4befba8e43ccd46ae3231b89e"}, + {file = "llvmlite-0.45.0.tar.gz", hash = "sha256:ceb0bcd20da949178bd7ab78af8de73e9f3c483ac46b5bef39f06a4862aa8336"}, +] + +[[package]] +name = "numba" +version = "0.62.0" +description = "compiling Python code using LLVM" +optional = false +python-versions = ">=3.10" +groups = ["main"] +files = [ + {file = "numba-0.62.0-cp310-cp310-macosx_10_15_x86_64.whl", hash = "sha256:3e7eaff7ce35799de4dda09a4cfcf1bb204ad59be5fa29a1efc080c0a72eb6d6"}, + {file = "numba-0.62.0-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:a7694c45ddfe5c9a26d05cd2bf378e214ae2d5332601a3c89c94207eb4661166"}, + {file = "numba-0.62.0-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:c2f07c6e67e8f54dba62a46a3b72294c5f4333ff703eb8966576ef731cc8ecd7"}, + {file = "numba-0.62.0-cp310-cp310-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:7f77fadaa6592d2a6b9c35bcddc710b22dceca0af9a7037dbc61ff209eaddfa8"}, + {file = "numba-0.62.0-cp310-cp310-win_amd64.whl", hash = "sha256:77050a79f6bc19324c2f6f456c074a49d3de35c8124c91668054e9d62243ac99"}, + {file = "numba-0.62.0-cp311-cp311-macosx_10_15_x86_64.whl", hash = "sha256:1370708a54281e1dd3e4b73f423f88d3b34b64cf3f5fa0e460a1fbe6bd4e0f3f"}, + {file = "numba-0.62.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:6bd7032d6c1e771967fc1d07a499bb10ce1639662451fc0a86089fa8efc420e7"}, + {file = "numba-0.62.0-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:87cdc476ea1b2feefb7f893a648be2f1e7a04f671f355ac9bbeb007eaf039f8c"}, + {file = "numba-0.62.0-cp311-cp311-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:144a57e504a5423acfc91fcd3be4e6481cb0667ce0bcc6cd3e8bd43a735b58a4"}, + {file = "numba-0.62.0-cp311-cp311-win_amd64.whl", hash = 
"sha256:499b00e0bd95c83fedf1cbf349b7132a432a90292cbe2014eeaf482ce7c3b9f8"}, + {file = "numba-0.62.0-cp312-cp312-macosx_10_15_x86_64.whl", hash = "sha256:82edb589c9607ec2dbe0b2d34793d8c5104daf766277acc49ad7e179f8634fd2"}, + {file = "numba-0.62.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:469e042750d5a6aa6847dc89d64de5f0bfaf2208b6d442e4634de3318b7043de"}, + {file = "numba-0.62.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:2ad2dc2b3583f8f24f35c8ade7e215c44590c9aa757ccba640dd293297cb15bb"}, + {file = "numba-0.62.0-cp312-cp312-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:0266998a842074fc91bfc406dd91c8ee12c196ea834375af6174f62647ffd9b1"}, + {file = "numba-0.62.0-cp312-cp312-win_amd64.whl", hash = "sha256:cbc84e030548a5aad74971eb1a579f69edc7da961d89ef09a5ee1fe01c207795"}, + {file = "numba-0.62.0-cp313-cp313-macosx_10_15_x86_64.whl", hash = "sha256:07e76ac7bcd47156a758df52e9752fdfb94ff5f80b78c4710cabc568d8d3d6ad"}, + {file = "numba-0.62.0-cp313-cp313-macosx_12_0_arm64.whl", hash = "sha256:a972689dad64a7047f555d93ce829fe05ca2519ad0cf7af0071a64145c571039"}, + {file = "numba-0.62.0-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:f789b1f2997fc34b1b88fcc4481886dcd44afcffbd3e28affedce54aec7fdcc1"}, + {file = "numba-0.62.0-cp313-cp313-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:516525981f19f36d3a0bada0fb7479cf0bf925b5e389d03aac87f3758c5cfb9e"}, + {file = "numba-0.62.0-cp313-cp313-win_amd64.whl", hash = "sha256:591a9c485904f219a129b0493f89d27de24286fb66dd5a577b11edc62fc78db4"}, + {file = "numba-0.62.0.tar.gz", hash = "sha256:2afcc7899dc93fefecbb274a19c592170bc2dbfae02b00f83e305332a9857a5a"}, +] + +[package.dependencies] +llvmlite = "==0.45.*" +numpy = ">=1.22,<2.4" + +[[package]] +name = "numpy" +version = "2.3.3" +description = "Fundamental package for array computing in Python" +optional = false +python-versions = ">=3.11" +groups = ["main"] +files = [ + {file = 
"numpy-2.3.3-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:0ffc4f5caba7dfcbe944ed674b7eef683c7e94874046454bb79ed7ee0236f59d"}, + {file = "numpy-2.3.3-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:e7e946c7170858a0295f79a60214424caac2ffdb0063d4d79cb681f9aa0aa569"}, + {file = "numpy-2.3.3-cp311-cp311-macosx_14_0_arm64.whl", hash = "sha256:cd4260f64bc794c3390a63bf0728220dd1a68170c169088a1e0dfa2fde1be12f"}, + {file = "numpy-2.3.3-cp311-cp311-macosx_14_0_x86_64.whl", hash = "sha256:f0ddb4b96a87b6728df9362135e764eac3cfa674499943ebc44ce96c478ab125"}, + {file = "numpy-2.3.3-cp311-cp311-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:afd07d377f478344ec6ca2b8d4ca08ae8bd44706763d1efb56397de606393f48"}, + {file = "numpy-2.3.3-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:bc92a5dedcc53857249ca51ef29f5e5f2f8c513e22cfb90faeb20343b8c6f7a6"}, + {file = "numpy-2.3.3-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:7af05ed4dc19f308e1d9fc759f36f21921eb7bbfc82843eeec6b2a2863a0aefa"}, + {file = "numpy-2.3.3-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:433bf137e338677cebdd5beac0199ac84712ad9d630b74eceeb759eaa45ddf30"}, + {file = "numpy-2.3.3-cp311-cp311-win32.whl", hash = "sha256:eb63d443d7b4ffd1e873f8155260d7f58e7e4b095961b01c91062935c2491e57"}, + {file = "numpy-2.3.3-cp311-cp311-win_amd64.whl", hash = "sha256:ec9d249840f6a565f58d8f913bccac2444235025bbb13e9a4681783572ee3caa"}, + {file = "numpy-2.3.3-cp311-cp311-win_arm64.whl", hash = "sha256:74c2a948d02f88c11a3c075d9733f1ae67d97c6bdb97f2bb542f980458b257e7"}, + {file = "numpy-2.3.3-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:cfdd09f9c84a1a934cde1eec2267f0a43a7cd44b2cca4ff95b7c0d14d144b0bf"}, + {file = "numpy-2.3.3-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:cb32e3cf0f762aee47ad1ddc6672988f7f27045b0783c887190545baba73aa25"}, + {file = "numpy-2.3.3-cp312-cp312-macosx_14_0_arm64.whl", hash = 
"sha256:396b254daeb0a57b1fe0ecb5e3cff6fa79a380fa97c8f7781a6d08cd429418fe"}, + {file = "numpy-2.3.3-cp312-cp312-macosx_14_0_x86_64.whl", hash = "sha256:067e3d7159a5d8f8a0b46ee11148fc35ca9b21f61e3c49fbd0a027450e65a33b"}, + {file = "numpy-2.3.3-cp312-cp312-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:1c02d0629d25d426585fb2e45a66154081b9fa677bc92a881ff1d216bc9919a8"}, + {file = "numpy-2.3.3-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:d9192da52b9745f7f0766531dcfa978b7763916f158bb63bdb8a1eca0068ab20"}, + {file = "numpy-2.3.3-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:cd7de500a5b66319db419dc3c345244404a164beae0d0937283b907d8152e6ea"}, + {file = "numpy-2.3.3-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:93d4962d8f82af58f0b2eb85daaf1b3ca23fe0a85d0be8f1f2b7bb46034e56d7"}, + {file = "numpy-2.3.3-cp312-cp312-win32.whl", hash = "sha256:5534ed6b92f9b7dca6c0a19d6df12d41c68b991cef051d108f6dbff3babc4ebf"}, + {file = "numpy-2.3.3-cp312-cp312-win_amd64.whl", hash = "sha256:497d7cad08e7092dba36e3d296fe4c97708c93daf26643a1ae4b03f6294d30eb"}, + {file = "numpy-2.3.3-cp312-cp312-win_arm64.whl", hash = "sha256:ca0309a18d4dfea6fc6262a66d06c26cfe4640c3926ceec90e57791a82b6eee5"}, + {file = "numpy-2.3.3-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:f5415fb78995644253370985342cd03572ef8620b934da27d77377a2285955bf"}, + {file = "numpy-2.3.3-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:d00de139a3324e26ed5b95870ce63be7ec7352171bc69a4cf1f157a48e3eb6b7"}, + {file = "numpy-2.3.3-cp313-cp313-macosx_14_0_arm64.whl", hash = "sha256:9dc13c6a5829610cc07422bc74d3ac083bd8323f14e2827d992f9e52e22cd6a6"}, + {file = "numpy-2.3.3-cp313-cp313-macosx_14_0_x86_64.whl", hash = "sha256:d79715d95f1894771eb4e60fb23f065663b2298f7d22945d66877aadf33d00c7"}, + {file = "numpy-2.3.3-cp313-cp313-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:952cfd0748514ea7c3afc729a0fc639e61655ce4c55ab9acfab14bda4f402b4c"}, + {file = 
"numpy-2.3.3-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:5b83648633d46f77039c29078751f80da65aa64d5622a3cd62aaef9d835b6c93"}, + {file = "numpy-2.3.3-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:b001bae8cea1c7dfdb2ae2b017ed0a6f2102d7a70059df1e338e307a4c78a8ae"}, + {file = "numpy-2.3.3-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:8e9aced64054739037d42fb84c54dd38b81ee238816c948c8f3ed134665dcd86"}, + {file = "numpy-2.3.3-cp313-cp313-win32.whl", hash = "sha256:9591e1221db3f37751e6442850429b3aabf7026d3b05542d102944ca7f00c8a8"}, + {file = "numpy-2.3.3-cp313-cp313-win_amd64.whl", hash = "sha256:f0dadeb302887f07431910f67a14d57209ed91130be0adea2f9793f1a4f817cf"}, + {file = "numpy-2.3.3-cp313-cp313-win_arm64.whl", hash = "sha256:3c7cf302ac6e0b76a64c4aecf1a09e51abd9b01fc7feee80f6c43e3ab1b1dbc5"}, + {file = "numpy-2.3.3-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:eda59e44957d272846bb407aad19f89dc6f58fecf3504bd144f4c5cf81a7eacc"}, + {file = "numpy-2.3.3-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:823d04112bc85ef5c4fda73ba24e6096c8f869931405a80aa8b0e604510a26bc"}, + {file = "numpy-2.3.3-cp313-cp313t-macosx_14_0_arm64.whl", hash = "sha256:40051003e03db4041aa325da2a0971ba41cf65714e65d296397cc0e32de6018b"}, + {file = "numpy-2.3.3-cp313-cp313t-macosx_14_0_x86_64.whl", hash = "sha256:6ee9086235dd6ab7ae75aba5662f582a81ced49f0f1c6de4260a78d8f2d91a19"}, + {file = "numpy-2.3.3-cp313-cp313t-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:94fcaa68757c3e2e668ddadeaa86ab05499a70725811e582b6a9858dd472fb30"}, + {file = "numpy-2.3.3-cp313-cp313t-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:da1a74b90e7483d6ce5244053399a614b1d6b7bc30a60d2f570e5071f8959d3e"}, + {file = "numpy-2.3.3-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:2990adf06d1ecee3b3dcbb4977dfab6e9f09807598d647f04d385d29e7a3c3d3"}, + {file = "numpy-2.3.3-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = 
"sha256:ed635ff692483b8e3f0fcaa8e7eb8a75ee71aa6d975388224f70821421800cea"}, + {file = "numpy-2.3.3-cp313-cp313t-win32.whl", hash = "sha256:a333b4ed33d8dc2b373cc955ca57babc00cd6f9009991d9edc5ddbc1bac36bcd"}, + {file = "numpy-2.3.3-cp313-cp313t-win_amd64.whl", hash = "sha256:4384a169c4d8f97195980815d6fcad04933a7e1ab3b530921c3fef7a1c63426d"}, + {file = "numpy-2.3.3-cp313-cp313t-win_arm64.whl", hash = "sha256:75370986cc0bc66f4ce5110ad35aae6d182cc4ce6433c40ad151f53690130bf1"}, + {file = "numpy-2.3.3-cp314-cp314-macosx_10_13_x86_64.whl", hash = "sha256:cd052f1fa6a78dee696b58a914b7229ecfa41f0a6d96dc663c1220a55e137593"}, + {file = "numpy-2.3.3-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:414a97499480067d305fcac9716c29cf4d0d76db6ebf0bf3cbce666677f12652"}, + {file = "numpy-2.3.3-cp314-cp314-macosx_14_0_arm64.whl", hash = "sha256:50a5fe69f135f88a2be9b6ca0481a68a136f6febe1916e4920e12f1a34e708a7"}, + {file = "numpy-2.3.3-cp314-cp314-macosx_14_0_x86_64.whl", hash = "sha256:b912f2ed2b67a129e6a601e9d93d4fa37bef67e54cac442a2f588a54afe5c67a"}, + {file = "numpy-2.3.3-cp314-cp314-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:9e318ee0596d76d4cb3d78535dc005fa60e5ea348cd131a51e99d0bdbe0b54fe"}, + {file = "numpy-2.3.3-cp314-cp314-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:ce020080e4a52426202bdb6f7691c65bb55e49f261f31a8f506c9f6bc7450421"}, + {file = "numpy-2.3.3-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:e6687dc183aa55dae4a705b35f9c0f8cb178bcaa2f029b241ac5356221d5c021"}, + {file = "numpy-2.3.3-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:d8f3b1080782469fdc1718c4ed1d22549b5fb12af0d57d35e992158a772a37cf"}, + {file = "numpy-2.3.3-cp314-cp314-win32.whl", hash = "sha256:cb248499b0bc3be66ebd6578b83e5acacf1d6cb2a77f2248ce0e40fbec5a76d0"}, + {file = "numpy-2.3.3-cp314-cp314-win_amd64.whl", hash = "sha256:691808c2b26b0f002a032c73255d0bd89751425f379f7bcd22d140db593a96e8"}, + {file = "numpy-2.3.3-cp314-cp314-win_arm64.whl", hash = 
"sha256:9ad12e976ca7b10f1774b03615a2a4bab8addce37ecc77394d8e986927dc0dfe"}, + {file = "numpy-2.3.3-cp314-cp314t-macosx_10_13_x86_64.whl", hash = "sha256:9cc48e09feb11e1db00b320e9d30a4151f7369afb96bd0e48d942d09da3a0d00"}, + {file = "numpy-2.3.3-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:901bf6123879b7f251d3631967fd574690734236075082078e0571977c6a8e6a"}, + {file = "numpy-2.3.3-cp314-cp314t-macosx_14_0_arm64.whl", hash = "sha256:7f025652034199c301049296b59fa7d52c7e625017cae4c75d8662e377bf487d"}, + {file = "numpy-2.3.3-cp314-cp314t-macosx_14_0_x86_64.whl", hash = "sha256:533ca5f6d325c80b6007d4d7fb1984c303553534191024ec6a524a4c92a5935a"}, + {file = "numpy-2.3.3-cp314-cp314t-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:0edd58682a399824633b66885d699d7de982800053acf20be1eaa46d92009c54"}, + {file = "numpy-2.3.3-cp314-cp314t-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:367ad5d8fbec5d9296d18478804a530f1191e24ab4d75ab408346ae88045d25e"}, + {file = "numpy-2.3.3-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:8f6ac61a217437946a1fa48d24c47c91a0c4f725237871117dea264982128097"}, + {file = "numpy-2.3.3-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:179a42101b845a816d464b6fe9a845dfaf308fdfc7925387195570789bb2c970"}, + {file = "numpy-2.3.3-cp314-cp314t-win32.whl", hash = "sha256:1250c5d3d2562ec4174bce2e3a1523041595f9b651065e4a4473f5f48a6bc8a5"}, + {file = "numpy-2.3.3-cp314-cp314t-win_amd64.whl", hash = "sha256:b37a0b2e5935409daebe82c1e42274d30d9dd355852529eab91dab8dcca7419f"}, + {file = "numpy-2.3.3-cp314-cp314t-win_arm64.whl", hash = "sha256:78c9f6560dc7e6b3990e32df7ea1a50bbd0e2a111e05209963f5ddcab7073b0b"}, + {file = "numpy-2.3.3-pp311-pypy311_pp73-macosx_10_15_x86_64.whl", hash = "sha256:1e02c7159791cd481e1e6d5ddd766b62a4d5acf8df4d4d1afe35ee9c5c33a41e"}, + {file = "numpy-2.3.3-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:dca2d0fc80b3893ae72197b39f69d55a3cd8b17ea1b50aa4c62de82419936150"}, + {file = 
"numpy-2.3.3-pp311-pypy311_pp73-macosx_14_0_arm64.whl", hash = "sha256:99683cbe0658f8271b333a1b1b4bb3173750ad59c0c61f5bbdc5b318918fffe3"}, + {file = "numpy-2.3.3-pp311-pypy311_pp73-macosx_14_0_x86_64.whl", hash = "sha256:d9d537a39cc9de668e5cd0e25affb17aec17b577c6b3ae8a3d866b479fbe88d0"}, + {file = "numpy-2.3.3-pp311-pypy311_pp73-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:8596ba2f8af5f93b01d97563832686d20206d303024777f6dfc2e7c7c3f1850e"}, + {file = "numpy-2.3.3-pp311-pypy311_pp73-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:e1ec5615b05369925bd1125f27df33f3b6c8bc10d788d5999ecd8769a1fa04db"}, + {file = "numpy-2.3.3-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:2e267c7da5bf7309670523896df97f93f6e469fb931161f483cd6882b3b1a5dc"}, + {file = "numpy-2.3.3.tar.gz", hash = "sha256:ddc7c39727ba62b80dfdbedf400d1c10ddfa8eefbd7ec8dcb118be8b56d31029"}, +] + +[[package]] +name = "pandas" +version = "2.3.2" +description = "Powerful data structures for data analysis, time series, and statistics" +optional = false +python-versions = ">=3.9" +groups = ["main"] +files = [ + {file = "pandas-2.3.2-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:52bc29a946304c360561974c6542d1dd628ddafa69134a7131fdfd6a5d7a1a35"}, + {file = "pandas-2.3.2-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:220cc5c35ffaa764dd5bb17cf42df283b5cb7fdf49e10a7b053a06c9cb48ee2b"}, + {file = "pandas-2.3.2-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:42c05e15111221384019897df20c6fe893b2f697d03c811ee67ec9e0bb5a3424"}, + {file = "pandas-2.3.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:cc03acc273c5515ab69f898df99d9d4f12c4d70dbfc24c3acc6203751d0804cf"}, + {file = "pandas-2.3.2-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:d25c20a03e8870f6339bcf67281b946bd20b86f1a544ebbebb87e66a8d642cba"}, + {file = "pandas-2.3.2-cp310-cp310-musllinux_1_2_x86_64.whl", hash = 
"sha256:21bb612d148bb5860b7eb2c10faacf1a810799245afd342cf297d7551513fbb6"}, + {file = "pandas-2.3.2-cp310-cp310-win_amd64.whl", hash = "sha256:b62d586eb25cb8cb70a5746a378fc3194cb7f11ea77170d59f889f5dfe3cec7a"}, + {file = "pandas-2.3.2-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:1333e9c299adcbb68ee89a9bb568fc3f20f9cbb419f1dd5225071e6cddb2a743"}, + {file = "pandas-2.3.2-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:76972bcbd7de8e91ad5f0ca884a9f2c477a2125354af624e022c49e5bd0dfff4"}, + {file = "pandas-2.3.2-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:b98bdd7c456a05eef7cd21fd6b29e3ca243591fe531c62be94a2cc987efb5ac2"}, + {file = "pandas-2.3.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:1d81573b3f7db40d020983f78721e9bfc425f411e616ef019a10ebf597aedb2e"}, + {file = "pandas-2.3.2-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:e190b738675a73b581736cc8ec71ae113d6c3768d0bd18bffa5b9a0927b0b6ea"}, + {file = "pandas-2.3.2-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:c253828cb08f47488d60f43c5fc95114c771bbfff085da54bfc79cb4f9e3a372"}, + {file = "pandas-2.3.2-cp311-cp311-win_amd64.whl", hash = "sha256:9467697b8083f9667b212633ad6aa4ab32436dcbaf4cd57325debb0ddef2012f"}, + {file = "pandas-2.3.2-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:3fbb977f802156e7a3f829e9d1d5398f6192375a3e2d1a9ee0803e35fe70a2b9"}, + {file = "pandas-2.3.2-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:1b9b52693123dd234b7c985c68b709b0b009f4521000d0525f2b95c22f15944b"}, + {file = "pandas-2.3.2-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:0bd281310d4f412733f319a5bc552f86d62cddc5f51d2e392c8787335c994175"}, + {file = "pandas-2.3.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:96d31a6b4354e3b9b8a2c848af75d31da390657e3ac6f30c05c82068b9ed79b9"}, + {file = "pandas-2.3.2-cp312-cp312-musllinux_1_2_aarch64.whl", hash = 
"sha256:df4df0b9d02bb873a106971bb85d448378ef14b86ba96f035f50bbd3688456b4"}, + {file = "pandas-2.3.2-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:213a5adf93d020b74327cb2c1b842884dbdd37f895f42dcc2f09d451d949f811"}, + {file = "pandas-2.3.2-cp312-cp312-win_amd64.whl", hash = "sha256:8c13b81a9347eb8c7548f53fd9a4f08d4dfe996836543f805c987bafa03317ae"}, + {file = "pandas-2.3.2-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:0c6ecbac99a354a051ef21c5307601093cb9e0f4b1855984a084bfec9302699e"}, + {file = "pandas-2.3.2-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:c6f048aa0fd080d6a06cc7e7537c09b53be6642d330ac6f54a600c3ace857ee9"}, + {file = "pandas-2.3.2-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:0064187b80a5be6f2f9c9d6bdde29372468751dfa89f4211a3c5871854cfbf7a"}, + {file = "pandas-2.3.2-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:4ac8c320bded4718b298281339c1a50fb00a6ba78cb2a63521c39bec95b0209b"}, + {file = "pandas-2.3.2-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:114c2fe4f4328cf98ce5716d1532f3ab79c5919f95a9cfee81d9140064a2e4d6"}, + {file = "pandas-2.3.2-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:48fa91c4dfb3b2b9bfdb5c24cd3567575f4e13f9636810462ffed8925352be5a"}, + {file = "pandas-2.3.2-cp313-cp313-win_amd64.whl", hash = "sha256:12d039facec710f7ba305786837d0225a3444af7bbd9c15c32ca2d40d157ed8b"}, + {file = "pandas-2.3.2-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:c624b615ce97864eb588779ed4046186f967374185c047070545253a52ab2d57"}, + {file = "pandas-2.3.2-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:0cee69d583b9b128823d9514171cabb6861e09409af805b54459bd0c821a35c2"}, + {file = "pandas-2.3.2-cp313-cp313t-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:2319656ed81124982900b4c37f0e0c58c015af9a7bbc62342ba5ad07ace82ba9"}, + {file = "pandas-2.3.2-cp313-cp313t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = 
"sha256:b37205ad6f00d52f16b6d09f406434ba928c1a1966e2771006a9033c736d30d2"}, + {file = "pandas-2.3.2-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:837248b4fc3a9b83b9c6214699a13f069dc13510a6a6d7f9ba33145d2841a012"}, + {file = "pandas-2.3.2-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:d2c3554bd31b731cd6490d94a28f3abb8dd770634a9e06eb6d2911b9827db370"}, + {file = "pandas-2.3.2-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:88080a0ff8a55eac9c84e3ff3c7665b3b5476c6fbc484775ca1910ce1c3e0b87"}, + {file = "pandas-2.3.2-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:d4a558c7620340a0931828d8065688b3cc5b4c8eb674bcaf33d18ff4a6870b4a"}, + {file = "pandas-2.3.2-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:45178cf09d1858a1509dc73ec261bf5b25a625a389b65be2e47b559905f0ab6a"}, + {file = "pandas-2.3.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:77cefe00e1b210f9c76c697fedd8fdb8d3dd86563e9c8adc9fa72b90f5e9e4c2"}, + {file = "pandas-2.3.2-cp39-cp39-musllinux_1_2_aarch64.whl", hash = "sha256:13bd629c653856f00c53dc495191baa59bcafbbf54860a46ecc50d3a88421a96"}, + {file = "pandas-2.3.2-cp39-cp39-musllinux_1_2_x86_64.whl", hash = "sha256:36d627906fd44b5fd63c943264e11e96e923f8de77d6016dc2f667b9ad193438"}, + {file = "pandas-2.3.2-cp39-cp39-win_amd64.whl", hash = "sha256:a9d7ec92d71a420185dec44909c32e9a362248c4ae2238234b76d5be37f208cc"}, + {file = "pandas-2.3.2.tar.gz", hash = "sha256:ab7b58f8f82706890924ccdfb5f48002b83d2b5a3845976a9fb705d36c34dcdb"}, +] + +[package.dependencies] +numpy = {version = ">=1.26.0", markers = "python_version >= \"3.12\""} +python-dateutil = ">=2.8.2" +pytz = ">=2020.1" +tzdata = ">=2022.7" + +[package.extras] +all = ["PyQt5 (>=5.15.9)", "SQLAlchemy (>=2.0.0)", "adbc-driver-postgresql (>=0.8.0)", "adbc-driver-sqlite (>=0.8.0)", "beautifulsoup4 (>=4.11.2)", "bottleneck (>=1.3.6)", "dataframe-api-compat (>=0.1.7)", "fastparquet (>=2022.12.0)", "fsspec (>=2022.11.0)", "gcsfs (>=2022.11.0)", 
"html5lib (>=1.1)", "hypothesis (>=6.46.1)", "jinja2 (>=3.1.2)", "lxml (>=4.9.2)", "matplotlib (>=3.6.3)", "numba (>=0.56.4)", "numexpr (>=2.8.4)", "odfpy (>=1.4.1)", "openpyxl (>=3.1.0)", "pandas-gbq (>=0.19.0)", "psycopg2 (>=2.9.6)", "pyarrow (>=10.0.1)", "pymysql (>=1.0.2)", "pyreadstat (>=1.2.0)", "pytest (>=7.3.2)", "pytest-xdist (>=2.2.0)", "python-calamine (>=0.1.7)", "pyxlsb (>=1.0.10)", "qtpy (>=2.3.0)", "s3fs (>=2022.11.0)", "scipy (>=1.10.0)", "tables (>=3.8.0)", "tabulate (>=0.9.0)", "xarray (>=2022.12.0)", "xlrd (>=2.0.1)", "xlsxwriter (>=3.0.5)", "zstandard (>=0.19.0)"] +aws = ["s3fs (>=2022.11.0)"] +clipboard = ["PyQt5 (>=5.15.9)", "qtpy (>=2.3.0)"] +compression = ["zstandard (>=0.19.0)"] +computation = ["scipy (>=1.10.0)", "xarray (>=2022.12.0)"] +consortium-standard = ["dataframe-api-compat (>=0.1.7)"] +excel = ["odfpy (>=1.4.1)", "openpyxl (>=3.1.0)", "python-calamine (>=0.1.7)", "pyxlsb (>=1.0.10)", "xlrd (>=2.0.1)", "xlsxwriter (>=3.0.5)"] +feather = ["pyarrow (>=10.0.1)"] +fss = ["fsspec (>=2022.11.0)"] +gcp = ["gcsfs (>=2022.11.0)", "pandas-gbq (>=0.19.0)"] +hdf5 = ["tables (>=3.8.0)"] +html = ["beautifulsoup4 (>=4.11.2)", "html5lib (>=1.1)", "lxml (>=4.9.2)"] +mysql = ["SQLAlchemy (>=2.0.0)", "pymysql (>=1.0.2)"] +output-formatting = ["jinja2 (>=3.1.2)", "tabulate (>=0.9.0)"] +parquet = ["pyarrow (>=10.0.1)"] +performance = ["bottleneck (>=1.3.6)", "numba (>=0.56.4)", "numexpr (>=2.8.4)"] +plot = ["matplotlib (>=3.6.3)"] +postgresql = ["SQLAlchemy (>=2.0.0)", "adbc-driver-postgresql (>=0.8.0)", "psycopg2 (>=2.9.6)"] +pyarrow = ["pyarrow (>=10.0.1)"] +spss = ["pyreadstat (>=1.2.0)"] +sql-other = ["SQLAlchemy (>=2.0.0)", "adbc-driver-postgresql (>=0.8.0)", "adbc-driver-sqlite (>=0.8.0)"] +test = ["hypothesis (>=6.46.1)", "pytest (>=7.3.2)", "pytest-xdist (>=2.2.0)"] +xml = ["lxml (>=4.9.2)"] + +[[package]] +name = "pillow" +version = "11.3.0" +description = "Python Imaging Library (Fork)" +optional = false +python-versions = ">=3.9" +groups = 
["main"] +files = [ + {file = "pillow-11.3.0-cp310-cp310-macosx_10_10_x86_64.whl", hash = "sha256:1b9c17fd4ace828b3003dfd1e30bff24863e0eb59b535e8f80194d9cc7ecf860"}, + {file = "pillow-11.3.0-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:65dc69160114cdd0ca0f35cb434633c75e8e7fad4cf855177a05bf38678f73ad"}, + {file = "pillow-11.3.0-cp310-cp310-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:7107195ddc914f656c7fc8e4a5e1c25f32e9236ea3ea860f257b0436011fddd0"}, + {file = "pillow-11.3.0-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:cc3e831b563b3114baac7ec2ee86819eb03caa1a2cef0b481a5675b59c4fe23b"}, + {file = "pillow-11.3.0-cp310-cp310-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:f1f182ebd2303acf8c380a54f615ec883322593320a9b00438eb842c1f37ae50"}, + {file = "pillow-11.3.0-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:4445fa62e15936a028672fd48c4c11a66d641d2c05726c7ec1f8ba6a572036ae"}, + {file = "pillow-11.3.0-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:71f511f6b3b91dd543282477be45a033e4845a40278fa8dcdbfdb07109bf18f9"}, + {file = "pillow-11.3.0-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:040a5b691b0713e1f6cbe222e0f4f74cd233421e105850ae3b3c0ceda520f42e"}, + {file = "pillow-11.3.0-cp310-cp310-win32.whl", hash = "sha256:89bd777bc6624fe4115e9fac3352c79ed60f3bb18651420635f26e643e3dd1f6"}, + {file = "pillow-11.3.0-cp310-cp310-win_amd64.whl", hash = "sha256:19d2ff547c75b8e3ff46f4d9ef969a06c30ab2d4263a9e287733aa8b2429ce8f"}, + {file = "pillow-11.3.0-cp310-cp310-win_arm64.whl", hash = "sha256:819931d25e57b513242859ce1876c58c59dc31587847bf74cfe06b2e0cb22d2f"}, + {file = "pillow-11.3.0-cp311-cp311-macosx_10_10_x86_64.whl", hash = "sha256:1cd110edf822773368b396281a2293aeb91c90a2db00d78ea43e7e861631b722"}, + {file = "pillow-11.3.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:9c412fddd1b77a75aa904615ebaa6001f169b26fd467b4be93aded278266b288"}, + {file = 
"pillow-11.3.0-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:7d1aa4de119a0ecac0a34a9c8bde33f34022e2e8f99104e47a3ca392fd60e37d"}, + {file = "pillow-11.3.0-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:91da1d88226663594e3f6b4b8c3c8d85bd504117d043740a8e0ec449087cc494"}, + {file = "pillow-11.3.0-cp311-cp311-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:643f189248837533073c405ec2f0bb250ba54598cf80e8c1e043381a60632f58"}, + {file = "pillow-11.3.0-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:106064daa23a745510dabce1d84f29137a37224831d88eb4ce94bb187b1d7e5f"}, + {file = "pillow-11.3.0-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:cd8ff254faf15591e724dc7c4ddb6bf4793efcbe13802a4ae3e863cd300b493e"}, + {file = "pillow-11.3.0-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:932c754c2d51ad2b2271fd01c3d121daaa35e27efae2a616f77bf164bc0b3e94"}, + {file = "pillow-11.3.0-cp311-cp311-win32.whl", hash = "sha256:b4b8f3efc8d530a1544e5962bd6b403d5f7fe8b9e08227c6b255f98ad82b4ba0"}, + {file = "pillow-11.3.0-cp311-cp311-win_amd64.whl", hash = "sha256:1a992e86b0dd7aeb1f053cd506508c0999d710a8f07b4c791c63843fc6a807ac"}, + {file = "pillow-11.3.0-cp311-cp311-win_arm64.whl", hash = "sha256:30807c931ff7c095620fe04448e2c2fc673fcbb1ffe2a7da3fb39613489b1ddd"}, + {file = "pillow-11.3.0-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:fdae223722da47b024b867c1ea0be64e0df702c5e0a60e27daad39bf960dd1e4"}, + {file = "pillow-11.3.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:921bd305b10e82b4d1f5e802b6850677f965d8394203d182f078873851dada69"}, + {file = "pillow-11.3.0-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:eb76541cba2f958032d79d143b98a3a6b3ea87f0959bbe256c0b5e416599fd5d"}, + {file = "pillow-11.3.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = 
"sha256:67172f2944ebba3d4a7b54f2e95c786a3a50c21b88456329314caaa28cda70f6"}, + {file = "pillow-11.3.0-cp312-cp312-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:97f07ed9f56a3b9b5f49d3661dc9607484e85c67e27f3e8be2c7d28ca032fec7"}, + {file = "pillow-11.3.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:676b2815362456b5b3216b4fd5bd89d362100dc6f4945154ff172e206a22c024"}, + {file = "pillow-11.3.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:3e184b2f26ff146363dd07bde8b711833d7b0202e27d13540bfe2e35a323a809"}, + {file = "pillow-11.3.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:6be31e3fc9a621e071bc17bb7de63b85cbe0bfae91bb0363c893cbe67247780d"}, + {file = "pillow-11.3.0-cp312-cp312-win32.whl", hash = "sha256:7b161756381f0918e05e7cb8a371fff367e807770f8fe92ecb20d905d0e1c149"}, + {file = "pillow-11.3.0-cp312-cp312-win_amd64.whl", hash = "sha256:a6444696fce635783440b7f7a9fc24b3ad10a9ea3f0ab66c5905be1c19ccf17d"}, + {file = "pillow-11.3.0-cp312-cp312-win_arm64.whl", hash = "sha256:2aceea54f957dd4448264f9bf40875da0415c83eb85f55069d89c0ed436e3542"}, + {file = "pillow-11.3.0-cp313-cp313-ios_13_0_arm64_iphoneos.whl", hash = "sha256:1c627742b539bba4309df89171356fcb3cc5a9178355b2727d1b74a6cf155fbd"}, + {file = "pillow-11.3.0-cp313-cp313-ios_13_0_arm64_iphonesimulator.whl", hash = "sha256:30b7c02f3899d10f13d7a48163c8969e4e653f8b43416d23d13d1bbfdc93b9f8"}, + {file = "pillow-11.3.0-cp313-cp313-ios_13_0_x86_64_iphonesimulator.whl", hash = "sha256:7859a4cc7c9295f5838015d8cc0a9c215b77e43d07a25e460f35cf516df8626f"}, + {file = "pillow-11.3.0-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:ec1ee50470b0d050984394423d96325b744d55c701a439d2bd66089bff963d3c"}, + {file = "pillow-11.3.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:7db51d222548ccfd274e4572fdbf3e810a5e66b00608862f947b163e613b67dd"}, + {file = "pillow-11.3.0-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = 
"sha256:2d6fcc902a24ac74495df63faad1884282239265c6839a0a6416d33faedfae7e"}, + {file = "pillow-11.3.0-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:f0f5d8f4a08090c6d6d578351a2b91acf519a54986c055af27e7a93feae6d3f1"}, + {file = "pillow-11.3.0-cp313-cp313-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:c37d8ba9411d6003bba9e518db0db0c58a680ab9fe5179f040b0463644bc9805"}, + {file = "pillow-11.3.0-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:13f87d581e71d9189ab21fe0efb5a23e9f28552d5be6979e84001d3b8505abe8"}, + {file = "pillow-11.3.0-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:023f6d2d11784a465f09fd09a34b150ea4672e85fb3d05931d89f373ab14abb2"}, + {file = "pillow-11.3.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:45dfc51ac5975b938e9809451c51734124e73b04d0f0ac621649821a63852e7b"}, + {file = "pillow-11.3.0-cp313-cp313-win32.whl", hash = "sha256:a4d336baed65d50d37b88ca5b60c0fa9d81e3a87d4a7930d3880d1624d5b31f3"}, + {file = "pillow-11.3.0-cp313-cp313-win_amd64.whl", hash = "sha256:0bce5c4fd0921f99d2e858dc4d4d64193407e1b99478bc5cacecba2311abde51"}, + {file = "pillow-11.3.0-cp313-cp313-win_arm64.whl", hash = "sha256:1904e1264881f682f02b7f8167935cce37bc97db457f8e7849dc3a6a52b99580"}, + {file = "pillow-11.3.0-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:4c834a3921375c48ee6b9624061076bc0a32a60b5532b322cc0ea64e639dd50e"}, + {file = "pillow-11.3.0-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:5e05688ccef30ea69b9317a9ead994b93975104a677a36a8ed8106be9260aa6d"}, + {file = "pillow-11.3.0-cp313-cp313t-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:1019b04af07fc0163e2810167918cb5add8d74674b6267616021ab558dc98ced"}, + {file = "pillow-11.3.0-cp313-cp313t-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:f944255db153ebb2b19c51fe85dd99ef0ce494123f21b9db4877ffdfc5590c7c"}, + {file = 
"pillow-11.3.0-cp313-cp313t-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:1f85acb69adf2aaee8b7da124efebbdb959a104db34d3a2cb0f3793dbae422a8"}, + {file = "pillow-11.3.0-cp313-cp313t-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:05f6ecbeff5005399bb48d198f098a9b4b6bdf27b8487c7f38ca16eeb070cd59"}, + {file = "pillow-11.3.0-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:a7bc6e6fd0395bc052f16b1a8670859964dbd7003bd0af2ff08342eb6e442cfe"}, + {file = "pillow-11.3.0-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:83e1b0161c9d148125083a35c1c5a89db5b7054834fd4387499e06552035236c"}, + {file = "pillow-11.3.0-cp313-cp313t-win32.whl", hash = "sha256:2a3117c06b8fb646639dce83694f2f9eac405472713fcb1ae887469c0d4f6788"}, + {file = "pillow-11.3.0-cp313-cp313t-win_amd64.whl", hash = "sha256:857844335c95bea93fb39e0fa2726b4d9d758850b34075a7e3ff4f4fa3aa3b31"}, + {file = "pillow-11.3.0-cp313-cp313t-win_arm64.whl", hash = "sha256:8797edc41f3e8536ae4b10897ee2f637235c94f27404cac7297f7b607dd0716e"}, + {file = "pillow-11.3.0-cp314-cp314-macosx_10_13_x86_64.whl", hash = "sha256:d9da3df5f9ea2a89b81bb6087177fb1f4d1c7146d583a3fe5c672c0d94e55e12"}, + {file = "pillow-11.3.0-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:0b275ff9b04df7b640c59ec5a3cb113eefd3795a8df80bac69646ef699c6981a"}, + {file = "pillow-11.3.0-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:0743841cabd3dba6a83f38a92672cccbd69af56e3e91777b0ee7f4dba4385632"}, + {file = "pillow-11.3.0-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:2465a69cf967b8b49ee1b96d76718cd98c4e925414ead59fdf75cf0fd07df673"}, + {file = "pillow-11.3.0-cp314-cp314-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:41742638139424703b4d01665b807c6468e23e699e8e90cffefe291c5832b027"}, + {file = "pillow-11.3.0-cp314-cp314-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = 
"sha256:93efb0b4de7e340d99057415c749175e24c8864302369e05914682ba642e5d77"}, + {file = "pillow-11.3.0-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:7966e38dcd0fa11ca390aed7c6f20454443581d758242023cf36fcb319b1a874"}, + {file = "pillow-11.3.0-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:98a9afa7b9007c67ed84c57c9e0ad86a6000da96eaa638e4f8abe5b65ff83f0a"}, + {file = "pillow-11.3.0-cp314-cp314-win32.whl", hash = "sha256:02a723e6bf909e7cea0dac1b0e0310be9d7650cd66222a5f1c571455c0a45214"}, + {file = "pillow-11.3.0-cp314-cp314-win_amd64.whl", hash = "sha256:a418486160228f64dd9e9efcd132679b7a02a5f22c982c78b6fc7dab3fefb635"}, + {file = "pillow-11.3.0-cp314-cp314-win_arm64.whl", hash = "sha256:155658efb5e044669c08896c0c44231c5e9abcaadbc5cd3648df2f7c0b96b9a6"}, + {file = "pillow-11.3.0-cp314-cp314t-macosx_10_13_x86_64.whl", hash = "sha256:59a03cdf019efbfeeed910bf79c7c93255c3d54bc45898ac2a4140071b02b4ae"}, + {file = "pillow-11.3.0-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:f8a5827f84d973d8636e9dc5764af4f0cf2318d26744b3d902931701b0d46653"}, + {file = "pillow-11.3.0-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:ee92f2fd10f4adc4b43d07ec5e779932b4eb3dbfbc34790ada5a6669bc095aa6"}, + {file = "pillow-11.3.0-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:c96d333dcf42d01f47b37e0979b6bd73ec91eae18614864622d9b87bbd5bbf36"}, + {file = "pillow-11.3.0-cp314-cp314t-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:4c96f993ab8c98460cd0c001447bff6194403e8b1d7e149ade5f00594918128b"}, + {file = "pillow-11.3.0-cp314-cp314t-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:41342b64afeba938edb034d122b2dda5db2139b9a4af999729ba8818e0056477"}, + {file = "pillow-11.3.0-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:068d9c39a2d1b358eb9f245ce7ab1b5c3246c7c8c7d9ba58cfa5b43146c06e50"}, + {file = "pillow-11.3.0-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = 
"sha256:a1bc6ba083b145187f648b667e05a2534ecc4b9f2784c2cbe3089e44868f2b9b"}, + {file = "pillow-11.3.0-cp314-cp314t-win32.whl", hash = "sha256:118ca10c0d60b06d006be10a501fd6bbdfef559251ed31b794668ed569c87e12"}, + {file = "pillow-11.3.0-cp314-cp314t-win_amd64.whl", hash = "sha256:8924748b688aa210d79883357d102cd64690e56b923a186f35a82cbc10f997db"}, + {file = "pillow-11.3.0-cp314-cp314t-win_arm64.whl", hash = "sha256:79ea0d14d3ebad43ec77ad5272e6ff9bba5b679ef73375ea760261207fa8e0aa"}, + {file = "pillow-11.3.0-cp39-cp39-macosx_10_10_x86_64.whl", hash = "sha256:48d254f8a4c776de343051023eb61ffe818299eeac478da55227d96e241de53f"}, + {file = "pillow-11.3.0-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:7aee118e30a4cf54fdd873bd3a29de51e29105ab11f9aad8c32123f58c8f8081"}, + {file = "pillow-11.3.0-cp39-cp39-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:23cff760a9049c502721bdb743a7cb3e03365fafcdfc2ef9784610714166e5a4"}, + {file = "pillow-11.3.0-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:6359a3bc43f57d5b375d1ad54a0074318a0844d11b76abccf478c37c986d3cfc"}, + {file = "pillow-11.3.0-cp39-cp39-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:092c80c76635f5ecb10f3f83d76716165c96f5229addbd1ec2bdbbda7d496e06"}, + {file = "pillow-11.3.0-cp39-cp39-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:cadc9e0ea0a2431124cde7e1697106471fc4c1da01530e679b2391c37d3fbb3a"}, + {file = "pillow-11.3.0-cp39-cp39-musllinux_1_2_aarch64.whl", hash = "sha256:6a418691000f2a418c9135a7cf0d797c1bb7d9a485e61fe8e7722845b95ef978"}, + {file = "pillow-11.3.0-cp39-cp39-musllinux_1_2_x86_64.whl", hash = "sha256:97afb3a00b65cc0804d1c7abddbf090a81eaac02768af58cbdcaaa0a931e0b6d"}, + {file = "pillow-11.3.0-cp39-cp39-win32.whl", hash = "sha256:ea944117a7974ae78059fcc1800e5d3295172bb97035c0c1d9345fca1419da71"}, + {file = "pillow-11.3.0-cp39-cp39-win_amd64.whl", hash = "sha256:e5c5858ad8ec655450a7c7df532e9842cf8df7cc349df7225c60d5d348c8aada"}, + 
{file = "pillow-11.3.0-cp39-cp39-win_arm64.whl", hash = "sha256:6abdbfd3aea42be05702a8dd98832329c167ee84400a1d1f61ab11437f1717eb"}, + {file = "pillow-11.3.0-pp310-pypy310_pp73-macosx_10_15_x86_64.whl", hash = "sha256:3cee80663f29e3843b68199b9d6f4f54bd1d4a6b59bdd91bceefc51238bcb967"}, + {file = "pillow-11.3.0-pp310-pypy310_pp73-macosx_11_0_arm64.whl", hash = "sha256:b5f56c3f344f2ccaf0dd875d3e180f631dc60a51b314295a3e681fe8cf851fbe"}, + {file = "pillow-11.3.0-pp310-pypy310_pp73-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:e67d793d180c9df62f1f40aee3accca4829d3794c95098887edc18af4b8b780c"}, + {file = "pillow-11.3.0-pp310-pypy310_pp73-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:d000f46e2917c705e9fb93a3606ee4a819d1e3aa7a9b442f6444f07e77cf5e25"}, + {file = "pillow-11.3.0-pp310-pypy310_pp73-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:527b37216b6ac3a12d7838dc3bd75208ec57c1c6d11ef01902266a5a0c14fc27"}, + {file = "pillow-11.3.0-pp310-pypy310_pp73-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:be5463ac478b623b9dd3937afd7fb7ab3d79dd290a28e2b6df292dc75063eb8a"}, + {file = "pillow-11.3.0-pp310-pypy310_pp73-win_amd64.whl", hash = "sha256:8dc70ca24c110503e16918a658b869019126ecfe03109b754c402daff12b3d9f"}, + {file = "pillow-11.3.0-pp311-pypy311_pp73-macosx_10_15_x86_64.whl", hash = "sha256:7c8ec7a017ad1bd562f93dbd8505763e688d388cde6e4a010ae1486916e713e6"}, + {file = "pillow-11.3.0-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:9ab6ae226de48019caa8074894544af5b53a117ccb9d3b3dcb2871464c829438"}, + {file = "pillow-11.3.0-pp311-pypy311_pp73-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:fe27fb049cdcca11f11a7bfda64043c37b30e6b91f10cb5bab275806c32f6ab3"}, + {file = "pillow-11.3.0-pp311-pypy311_pp73-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:465b9e8844e3c3519a983d58b80be3f668e2a7a5db97f2784e7079fbc9f9822c"}, + {file = 
"pillow-11.3.0-pp311-pypy311_pp73-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:5418b53c0d59b3824d05e029669efa023bbef0f3e92e75ec8428f3799487f361"}, + {file = "pillow-11.3.0-pp311-pypy311_pp73-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:504b6f59505f08ae014f724b6207ff6222662aab5cc9542577fb084ed0676ac7"}, + {file = "pillow-11.3.0-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:c84d689db21a1c397d001aa08241044aa2069e7587b398c8cc63020390b1c1b8"}, + {file = "pillow-11.3.0.tar.gz", hash = "sha256:3828ee7586cd0b2091b6209e5ad53e20d0649bbe87164a459d0676e035e8f523"}, +] + +[package.extras] +docs = ["furo", "olefile", "sphinx (>=8.2)", "sphinx-autobuild", "sphinx-copybutton", "sphinx-inline-tabs", "sphinxext-opengraph"] +fpx = ["olefile"] +mic = ["olefile"] +test-arrow = ["pyarrow"] +tests = ["check-manifest", "coverage (>=7.4.2)", "defusedxml", "markdown2", "olefile", "packaging", "pyroma", "pytest", "pytest-cov", "pytest-timeout", "pytest-xdist", "trove-classifiers (>=2024.10.12)"] +typing = ["typing-extensions ; python_version < \"3.10\""] +xmp = ["defusedxml"] + +[[package]] +name = "pynndescent" +version = "0.5.13" +description = "Nearest Neighbor Descent" +optional = false +python-versions = "*" +groups = ["main"] +files = [ + {file = "pynndescent-0.5.13-py3-none-any.whl", hash = "sha256:69aabb8f394bc631b6ac475a1c7f3994c54adf3f51cd63b2730fefba5771b949"}, + {file = "pynndescent-0.5.13.tar.gz", hash = "sha256:d74254c0ee0a1eeec84597d5fe89fedcf778593eeabe32c2f97412934a9800fb"}, +] + +[package.dependencies] +joblib = ">=0.11" +llvmlite = ">=0.30" +numba = ">=0.51.2" +scikit-learn = ">=0.18" +scipy = ">=1.0" + +[[package]] +name = "python-dateutil" +version = "2.9.0.post0" +description = "Extensions to the standard Python datetime module" +optional = false +python-versions = "!=3.0.*,!=3.1.*,!=3.2.*,>=2.7" +groups = ["main"] +files = [ + {file = "python-dateutil-2.9.0.post0.tar.gz", hash = 
"sha256:37dd54208da7e1cd875388217d5e00ebd4179249f90fb72437e91a35459a0ad3"}, + {file = "python_dateutil-2.9.0.post0-py2.py3-none-any.whl", hash = "sha256:a8b2bc7bffae282281c8140a97d3aa9c14da0b136dfe83f850eea9a5f7470427"}, +] + +[package.dependencies] +six = ">=1.5" + +[[package]] +name = "pytz" +version = "2025.2" +description = "World timezone definitions, modern and historical" +optional = false +python-versions = "*" +groups = ["main"] +files = [ + {file = "pytz-2025.2-py2.py3-none-any.whl", hash = "sha256:5ddf76296dd8c44c26eb8f4b6f35488f3ccbf6fbbd7adee0b7262d43f0ec2f00"}, + {file = "pytz-2025.2.tar.gz", hash = "sha256:360b9e3dbb49a209c21ad61809c7fb453643e048b38924c765813546746e81c3"}, +] + +[[package]] +name = "scikit-learn" +version = "1.7.2" +description = "A set of python modules for machine learning and data mining" +optional = false +python-versions = ">=3.10" +groups = ["main"] +files = [ + {file = "scikit_learn-1.7.2-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:6b33579c10a3081d076ab403df4a4190da4f4432d443521674637677dc91e61f"}, + {file = "scikit_learn-1.7.2-cp310-cp310-macosx_12_0_arm64.whl", hash = "sha256:36749fb62b3d961b1ce4fedf08fa57a1986cd409eff2d783bca5d4b9b5fce51c"}, + {file = "scikit_learn-1.7.2-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:7a58814265dfc52b3295b1900cfb5701589d30a8bb026c7540f1e9d3499d5ec8"}, + {file = "scikit_learn-1.7.2-cp310-cp310-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:4a847fea807e278f821a0406ca01e387f97653e284ecbd9750e3ee7c90347f18"}, + {file = "scikit_learn-1.7.2-cp310-cp310-win_amd64.whl", hash = "sha256:ca250e6836d10e6f402436d6463d6c0e4d8e0234cfb6a9a47835bd392b852ce5"}, + {file = "scikit_learn-1.7.2-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:c7509693451651cd7361d30ce4e86a1347493554f172b1c72a39300fa2aea79e"}, + {file = "scikit_learn-1.7.2-cp311-cp311-macosx_12_0_arm64.whl", hash = 
"sha256:0486c8f827c2e7b64837c731c8feff72c0bd2b998067a8a9cbc10643c31f0fe1"}, + {file = "scikit_learn-1.7.2-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:89877e19a80c7b11a2891a27c21c4894fb18e2c2e077815bcade10d34287b20d"}, + {file = "scikit_learn-1.7.2-cp311-cp311-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:8da8bf89d4d79aaec192d2bda62f9b56ae4e5b4ef93b6a56b5de4977e375c1f1"}, + {file = "scikit_learn-1.7.2-cp311-cp311-win_amd64.whl", hash = "sha256:9b7ed8d58725030568523e937c43e56bc01cadb478fc43c042a9aca1dacb3ba1"}, + {file = "scikit_learn-1.7.2-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:8d91a97fa2b706943822398ab943cde71858a50245e31bc71dba62aab1d60a96"}, + {file = "scikit_learn-1.7.2-cp312-cp312-macosx_12_0_arm64.whl", hash = "sha256:acbc0f5fd2edd3432a22c69bed78e837c70cf896cd7993d71d51ba6708507476"}, + {file = "scikit_learn-1.7.2-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:e5bf3d930aee75a65478df91ac1225ff89cd28e9ac7bd1196853a9229b6adb0b"}, + {file = "scikit_learn-1.7.2-cp312-cp312-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:b4d6e9deed1a47aca9fe2f267ab8e8fe82ee20b4526b2c0cd9e135cea10feb44"}, + {file = "scikit_learn-1.7.2-cp312-cp312-win_amd64.whl", hash = "sha256:6088aa475f0785e01bcf8529f55280a3d7d298679f50c0bb70a2364a82d0b290"}, + {file = "scikit_learn-1.7.2-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:0b7dacaa05e5d76759fb071558a8b5130f4845166d88654a0f9bdf3eb57851b7"}, + {file = "scikit_learn-1.7.2-cp313-cp313-macosx_12_0_arm64.whl", hash = "sha256:abebbd61ad9e1deed54cca45caea8ad5f79e1b93173dece40bb8e0c658dbe6fe"}, + {file = "scikit_learn-1.7.2-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:502c18e39849c0ea1a5d681af1dbcf15f6cce601aebb657aabbfe84133c1907f"}, + {file = "scikit_learn-1.7.2-cp313-cp313-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = 
"sha256:7a4c328a71785382fe3fe676a9ecf2c86189249beff90bf85e22bdb7efaf9ae0"}, + {file = "scikit_learn-1.7.2-cp313-cp313-win_amd64.whl", hash = "sha256:63a9afd6f7b229aad94618c01c252ce9e6fa97918c5ca19c9a17a087d819440c"}, + {file = "scikit_learn-1.7.2-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:9acb6c5e867447b4e1390930e3944a005e2cb115922e693c08a323421a6966e8"}, + {file = "scikit_learn-1.7.2-cp313-cp313t-macosx_12_0_arm64.whl", hash = "sha256:2a41e2a0ef45063e654152ec9d8bcfc39f7afce35b08902bfe290c2498a67a6a"}, + {file = "scikit_learn-1.7.2-cp313-cp313t-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:98335fb98509b73385b3ab2bd0639b1f610541d3988ee675c670371d6a87aa7c"}, + {file = "scikit_learn-1.7.2-cp313-cp313t-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:191e5550980d45449126e23ed1d5e9e24b2c68329ee1f691a3987476e115e09c"}, + {file = "scikit_learn-1.7.2-cp313-cp313t-win_amd64.whl", hash = "sha256:57dc4deb1d3762c75d685507fbd0bc17160144b2f2ba4ccea5dc285ab0d0e973"}, + {file = "scikit_learn-1.7.2-cp314-cp314-macosx_10_13_x86_64.whl", hash = "sha256:fa8f63940e29c82d1e67a45d5297bdebbcb585f5a5a50c4914cc2e852ab77f33"}, + {file = "scikit_learn-1.7.2-cp314-cp314-macosx_12_0_arm64.whl", hash = "sha256:f95dc55b7902b91331fa4e5845dd5bde0580c9cd9612b1b2791b7e80c3d32615"}, + {file = "scikit_learn-1.7.2-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:9656e4a53e54578ad10a434dc1f993330568cfee176dff07112b8785fb413106"}, + {file = "scikit_learn-1.7.2-cp314-cp314-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:96dc05a854add0e50d3f47a1ef21a10a595016da5b007c7d9cd9d0bffd1fcc61"}, + {file = "scikit_learn-1.7.2-cp314-cp314-win_amd64.whl", hash = "sha256:bb24510ed3f9f61476181e4db51ce801e2ba37541def12dc9333b946fc7a9cf8"}, + {file = "scikit_learn-1.7.2.tar.gz", hash = "sha256:20e9e49ecd130598f1ca38a1d85090e1a600147b9c02fa6f15d69cb53d968fda"}, +] + +[package.dependencies] +joblib = ">=1.2.0" +numpy = ">=1.22.0" 
+scipy = ">=1.8.0" +threadpoolctl = ">=3.1.0" + +[package.extras] +benchmark = ["matplotlib (>=3.5.0)", "memory_profiler (>=0.57.0)", "pandas (>=1.4.0)"] +build = ["cython (>=3.0.10)", "meson-python (>=0.17.1)", "numpy (>=1.22.0)", "scipy (>=1.8.0)"] +docs = ["Pillow (>=8.4.0)", "matplotlib (>=3.5.0)", "memory_profiler (>=0.57.0)", "numpydoc (>=1.2.0)", "pandas (>=1.4.0)", "plotly (>=5.14.0)", "polars (>=0.20.30)", "pooch (>=1.6.0)", "pydata-sphinx-theme (>=0.15.3)", "scikit-image (>=0.19.0)", "seaborn (>=0.9.0)", "sphinx (>=7.3.7)", "sphinx-copybutton (>=0.5.2)", "sphinx-design (>=0.5.0)", "sphinx-design (>=0.6.0)", "sphinx-gallery (>=0.17.1)", "sphinx-prompt (>=1.4.0)", "sphinx-remove-toctrees (>=1.0.0.post1)", "sphinxcontrib-sass (>=0.3.4)", "sphinxext-opengraph (>=0.9.1)", "towncrier (>=24.8.0)"] +examples = ["matplotlib (>=3.5.0)", "pandas (>=1.4.0)", "plotly (>=5.14.0)", "pooch (>=1.6.0)", "scikit-image (>=0.19.0)", "seaborn (>=0.9.0)"] +install = ["joblib (>=1.2.0)", "numpy (>=1.22.0)", "scipy (>=1.8.0)", "threadpoolctl (>=3.1.0)"] +maintenance = ["conda-lock (==3.0.1)"] +tests = ["matplotlib (>=3.5.0)", "mypy (>=1.15)", "numpydoc (>=1.2.0)", "pandas (>=1.4.0)", "polars (>=0.20.30)", "pooch (>=1.6.0)", "pyamg (>=4.2.1)", "pyarrow (>=12.0.0)", "pytest (>=7.1.2)", "pytest-cov (>=2.9.0)", "ruff (>=0.11.7)", "scikit-image (>=0.19.0)"] + +[[package]] +name = "scipy" +version = "1.16.2" +description = "Fundamental algorithms for scientific computing in Python" +optional = false +python-versions = ">=3.11" +groups = ["main"] +files = [ + {file = "scipy-1.16.2-cp311-cp311-macosx_10_14_x86_64.whl", hash = "sha256:6ab88ea43a57da1af33292ebd04b417e8e2eaf9d5aa05700be8d6e1b6501cd92"}, + {file = "scipy-1.16.2-cp311-cp311-macosx_12_0_arm64.whl", hash = "sha256:c95e96c7305c96ede73a7389f46ccd6c659c4da5ef1b2789466baeaed3622b6e"}, + {file = "scipy-1.16.2-cp311-cp311-macosx_14_0_arm64.whl", hash = "sha256:87eb178db04ece7c698220d523c170125dbffebb7af0345e66c3554f6f60c173"}, + 
{file = "scipy-1.16.2-cp311-cp311-macosx_14_0_x86_64.whl", hash = "sha256:4e409eac067dcee96a57fbcf424c13f428037827ec7ee3cb671ff525ca4fc34d"}, + {file = "scipy-1.16.2-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:e574be127bb760f0dad24ff6e217c80213d153058372362ccb9555a10fc5e8d2"}, + {file = "scipy-1.16.2-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:f5db5ba6188d698ba7abab982ad6973265b74bb40a1efe1821b58c87f73892b9"}, + {file = "scipy-1.16.2-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:ec6e74c4e884104ae006d34110677bfe0098203a3fec2f3faf349f4cb05165e3"}, + {file = "scipy-1.16.2-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:912f46667d2d3834bc3d57361f854226475f695eb08c08a904aadb1c936b6a88"}, + {file = "scipy-1.16.2-cp311-cp311-win_amd64.whl", hash = "sha256:91e9e8a37befa5a69e9cacbe0bcb79ae5afb4a0b130fd6db6ee6cc0d491695fa"}, + {file = "scipy-1.16.2-cp311-cp311-win_arm64.whl", hash = "sha256:f3bf75a6dcecab62afde4d1f973f1692be013110cad5338007927db8da73249c"}, + {file = "scipy-1.16.2-cp312-cp312-macosx_10_14_x86_64.whl", hash = "sha256:89d6c100fa5c48472047632e06f0876b3c4931aac1f4291afc81a3644316bb0d"}, + {file = "scipy-1.16.2-cp312-cp312-macosx_12_0_arm64.whl", hash = "sha256:ca748936cd579d3f01928b30a17dc474550b01272d8046e3e1ee593f23620371"}, + {file = "scipy-1.16.2-cp312-cp312-macosx_14_0_arm64.whl", hash = "sha256:fac4f8ce2ddb40e2e3d0f7ec36d2a1e7f92559a2471e59aec37bd8d9de01fec0"}, + {file = "scipy-1.16.2-cp312-cp312-macosx_14_0_x86_64.whl", hash = "sha256:033570f1dcefd79547a88e18bccacff025c8c647a330381064f561d43b821232"}, + {file = "scipy-1.16.2-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:ea3421209bf00c8a5ef2227de496601087d8f638a2363ee09af059bd70976dc1"}, + {file = "scipy-1.16.2-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:f66bd07ba6f84cd4a380b41d1bf3c59ea488b590a2ff96744845163309ee8e2f"}, + {file = 
"scipy-1.16.2-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:5e9feab931bd2aea4a23388c962df6468af3d808ddf2d40f94a81c5dc38f32ef"}, + {file = "scipy-1.16.2-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:03dfc75e52f72cf23ec2ced468645321407faad8f0fe7b1f5b49264adbc29cb1"}, + {file = "scipy-1.16.2-cp312-cp312-win_amd64.whl", hash = "sha256:0ce54e07bbb394b417457409a64fd015be623f36e330ac49306433ffe04bc97e"}, + {file = "scipy-1.16.2-cp312-cp312-win_arm64.whl", hash = "sha256:2a8ffaa4ac0df81a0b94577b18ee079f13fecdb924df3328fc44a7dc5ac46851"}, + {file = "scipy-1.16.2-cp313-cp313-macosx_10_14_x86_64.whl", hash = "sha256:84f7bf944b43e20b8a894f5fe593976926744f6c185bacfcbdfbb62736b5cc70"}, + {file = "scipy-1.16.2-cp313-cp313-macosx_12_0_arm64.whl", hash = "sha256:5c39026d12edc826a1ef2ad35ad1e6d7f087f934bb868fc43fa3049c8b8508f9"}, + {file = "scipy-1.16.2-cp313-cp313-macosx_14_0_arm64.whl", hash = "sha256:e52729ffd45b68777c5319560014d6fd251294200625d9d70fd8626516fc49f5"}, + {file = "scipy-1.16.2-cp313-cp313-macosx_14_0_x86_64.whl", hash = "sha256:024dd4a118cccec09ca3209b7e8e614931a6ffb804b2a601839499cb88bdf925"}, + {file = "scipy-1.16.2-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:7a5dc7ee9c33019973a470556081b0fd3c9f4c44019191039f9769183141a4d9"}, + {file = "scipy-1.16.2-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:c2275ff105e508942f99d4e3bc56b6ef5e4b3c0af970386ca56b777608ce95b7"}, + {file = "scipy-1.16.2-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:af80196eaa84f033e48444d2e0786ec47d328ba00c71e4299b602235ffef9acb"}, + {file = "scipy-1.16.2-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:9fb1eb735fe3d6ed1f89918224e3385fbf6f9e23757cacc35f9c78d3b712dd6e"}, + {file = "scipy-1.16.2-cp313-cp313-win_amd64.whl", hash = "sha256:fda714cf45ba43c9d3bae8f2585c777f64e3f89a2e073b668b32ede412d8f52c"}, + {file = "scipy-1.16.2-cp313-cp313-win_arm64.whl", hash = 
"sha256:2f5350da923ccfd0b00e07c3e5cfb316c1c0d6c1d864c07a72d092e9f20db104"}, + {file = "scipy-1.16.2-cp313-cp313t-macosx_10_14_x86_64.whl", hash = "sha256:53d8d2ee29b925344c13bda64ab51785f016b1b9617849dac10897f0701b20c1"}, + {file = "scipy-1.16.2-cp313-cp313t-macosx_12_0_arm64.whl", hash = "sha256:9e05e33657efb4c6a9d23bd8300101536abd99c85cca82da0bffff8d8764d08a"}, + {file = "scipy-1.16.2-cp313-cp313t-macosx_14_0_arm64.whl", hash = "sha256:7fe65b36036357003b3ef9d37547abeefaa353b237e989c21027b8ed62b12d4f"}, + {file = "scipy-1.16.2-cp313-cp313t-macosx_14_0_x86_64.whl", hash = "sha256:6406d2ac6d40b861cccf57f49592f9779071655e9f75cd4f977fa0bdd09cb2e4"}, + {file = "scipy-1.16.2-cp313-cp313t-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:ff4dc42bd321991fbf611c23fc35912d690f731c9914bf3af8f417e64aca0f21"}, + {file = "scipy-1.16.2-cp313-cp313t-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:654324826654d4d9133e10675325708fb954bc84dae6e9ad0a52e75c6b1a01d7"}, + {file = "scipy-1.16.2-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:63870a84cd15c44e65220eaed2dac0e8f8b26bbb991456a033c1d9abfe8a94f8"}, + {file = "scipy-1.16.2-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:fa01f0f6a3050fa6a9771a95d5faccc8e2f5a92b4a2e5440a0fa7264a2398472"}, + {file = "scipy-1.16.2-cp313-cp313t-win_amd64.whl", hash = "sha256:116296e89fba96f76353a8579820c2512f6e55835d3fad7780fece04367de351"}, + {file = "scipy-1.16.2-cp313-cp313t-win_arm64.whl", hash = "sha256:98e22834650be81d42982360382b43b17f7ba95e0e6993e2a4f5b9ad9283a94d"}, + {file = "scipy-1.16.2-cp314-cp314-macosx_10_14_x86_64.whl", hash = "sha256:567e77755019bb7461513c87f02bb73fb65b11f049aaaa8ca17cfaa5a5c45d77"}, + {file = "scipy-1.16.2-cp314-cp314-macosx_12_0_arm64.whl", hash = "sha256:17d9bb346194e8967296621208fcdfd39b55498ef7d2f376884d5ac47cec1a70"}, + {file = "scipy-1.16.2-cp314-cp314-macosx_14_0_arm64.whl", hash = "sha256:0a17541827a9b78b777d33b623a6dcfe2ef4a25806204d08ead0768f4e529a88"}, + 
{file = "scipy-1.16.2-cp314-cp314-macosx_14_0_x86_64.whl", hash = "sha256:d7d4c6ba016ffc0f9568d012f5f1eb77ddd99412aea121e6fa8b4c3b7cbad91f"}, + {file = "scipy-1.16.2-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:9702c4c023227785c779cba2e1d6f7635dbb5b2e0936cdd3a4ecb98d78fd41eb"}, + {file = "scipy-1.16.2-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:d1cdf0ac28948d225decdefcc45ad7dd91716c29ab56ef32f8e0d50657dffcc7"}, + {file = "scipy-1.16.2-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:70327d6aa572a17c2941cdfb20673f82e536e91850a2e4cb0c5b858b690e1548"}, + {file = "scipy-1.16.2-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:5221c0b2a4b58aa7c4ed0387d360fd90ee9086d383bb34d9f2789fafddc8a936"}, + {file = "scipy-1.16.2-cp314-cp314-win_amd64.whl", hash = "sha256:f5a85d7b2b708025af08f060a496dd261055b617d776fc05a1a1cc69e09fe9ff"}, + {file = "scipy-1.16.2-cp314-cp314-win_arm64.whl", hash = "sha256:2cc73a33305b4b24556957d5857d6253ce1e2dcd67fa0ff46d87d1670b3e1e1d"}, + {file = "scipy-1.16.2-cp314-cp314t-macosx_10_14_x86_64.whl", hash = "sha256:9ea2a3fed83065d77367775d689401a703d0f697420719ee10c0780bcab594d8"}, + {file = "scipy-1.16.2-cp314-cp314t-macosx_12_0_arm64.whl", hash = "sha256:7280d926f11ca945c3ef92ba960fa924e1465f8d07ce3a9923080363390624c4"}, + {file = "scipy-1.16.2-cp314-cp314t-macosx_14_0_arm64.whl", hash = "sha256:8afae1756f6a1fe04636407ef7dbece33d826a5d462b74f3d0eb82deabefd831"}, + {file = "scipy-1.16.2-cp314-cp314t-macosx_14_0_x86_64.whl", hash = "sha256:5c66511f29aa8d233388e7416a3f20d5cae7a2744d5cee2ecd38c081f4e861b3"}, + {file = "scipy-1.16.2-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:efe6305aeaa0e96b0ccca5ff647a43737d9a092064a3894e46c414db84bc54ac"}, + {file = "scipy-1.16.2-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:7f3a337d9ae06a1e8d655ee9d8ecb835ea5ddcdcbd8d23012afa055ab014f374"}, + {file = 
"scipy-1.16.2-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:bab3605795d269067d8ce78a910220262711b753de8913d3deeaedb5dded3bb6"}, + {file = "scipy-1.16.2-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:b0348d8ddb55be2a844c518cd8cc8deeeb8aeba707cf834db5758fc89b476a2c"}, + {file = "scipy-1.16.2-cp314-cp314t-win_amd64.whl", hash = "sha256:26284797e38b8a75e14ea6631d29bda11e76ceaa6ddb6fdebbfe4c4d90faf2f9"}, + {file = "scipy-1.16.2-cp314-cp314t-win_arm64.whl", hash = "sha256:d2a4472c231328d4de38d5f1f68fdd6d28a615138f842580a8a321b5845cf779"}, + {file = "scipy-1.16.2.tar.gz", hash = "sha256:af029b153d243a80afb6eabe40b0a07f8e35c9adc269c019f364ad747f826a6b"}, +] + +[package.dependencies] +numpy = ">=1.25.2,<2.6" + +[package.extras] +dev = ["cython-lint (>=0.12.2)", "doit (>=0.36.0)", "mypy (==1.10.0)", "pycodestyle", "pydevtool", "rich-click", "ruff (>=0.0.292)", "types-psutil", "typing_extensions"] +doc = ["intersphinx_registry", "jupyterlite-pyodide-kernel", "jupyterlite-sphinx (>=0.19.1)", "jupytext", "linkify-it-py", "matplotlib (>=3.5)", "myst-nb (>=1.2.0)", "numpydoc", "pooch", "pydata-sphinx-theme (>=0.15.2)", "sphinx (>=5.0.0,<8.2.0)", "sphinx-copybutton", "sphinx-design (>=0.4.0)"] +test = ["Cython", "array-api-strict (>=2.3.1)", "asv", "gmpy2", "hypothesis (>=6.30)", "meson", "mpmath", "ninja ; sys_platform != \"emscripten\"", "pooch", "pytest (>=8.0.0)", "pytest-cov", "pytest-timeout", "pytest-xdist", "scikit-umfpack", "threadpoolctl"] + +[[package]] +name = "six" +version = "1.17.0" +description = "Python 2 and 3 compatibility utilities" +optional = false +python-versions = "!=3.0.*,!=3.1.*,!=3.2.*,>=2.7" +groups = ["main"] +files = [ + {file = "six-1.17.0-py2.py3-none-any.whl", hash = "sha256:4721f391ed90541fddacab5acf947aa0d3dc7d27b2e1e8eda2be8970586c3274"}, + {file = "six-1.17.0.tar.gz", hash = "sha256:ff70335d468e7eb6ec65b95b99d3a2836546063f63acc5171de367e834932a81"}, +] + +[[package]] +name = "threadpoolctl" +version = "3.6.0" +description = 
"threadpoolctl" +optional = false +python-versions = ">=3.9" +groups = ["main"] +files = [ + {file = "threadpoolctl-3.6.0-py3-none-any.whl", hash = "sha256:43a0b8fd5a2928500110039e43a5eed8480b918967083ea48dc3ab9f13c4a7fb"}, + {file = "threadpoolctl-3.6.0.tar.gz", hash = "sha256:8ab8b4aa3491d812b623328249fab5302a68d2d71745c8a4c719a2fcaba9f44e"}, +] + +[[package]] +name = "tqdm" +version = "4.67.1" +description = "Fast, Extensible Progress Meter" +optional = false +python-versions = ">=3.7" +groups = ["main"] +files = [ + {file = "tqdm-4.67.1-py3-none-any.whl", hash = "sha256:26445eca388f82e72884e0d580d5464cd801a3ea01e63e5601bdff9ba6a48de2"}, + {file = "tqdm-4.67.1.tar.gz", hash = "sha256:f8aef9c52c08c13a65f30ea34f4e5aac3fd1a34959879d7e59e63027286627f2"}, +] + +[package.dependencies] +colorama = {version = "*", markers = "platform_system == \"Windows\""} + +[package.extras] +dev = ["nbval", "pytest (>=6)", "pytest-asyncio (>=0.24)", "pytest-cov", "pytest-timeout"] +discord = ["requests"] +notebook = ["ipywidgets (>=6)"] +slack = ["slack-sdk"] +telegram = ["requests"] + +[[package]] +name = "tzdata" +version = "2025.2" +description = "Provider of IANA time zone data" +optional = false +python-versions = ">=2" +groups = ["main"] +files = [ + {file = "tzdata-2025.2-py2.py3-none-any.whl", hash = "sha256:1a403fada01ff9221ca8044d701868fa132215d84beb92242d9acd2147f667a8"}, + {file = "tzdata-2025.2.tar.gz", hash = "sha256:b60a638fcc0daffadf82fe0f57e53d06bdec2f36c4df66280ae79bce6bd6f2b9"}, +] + +[[package]] +name = "umap-learn" +version = "0.5.9.post2" +description = "Uniform Manifold Approximation and Projection" +optional = false +python-versions = ">=3.9" +groups = ["main"] +files = [ + {file = "umap_learn-0.5.9.post2-py3-none-any.whl", hash = "sha256:fbe51166561e0e7fab00ef3d516ac2621243b8d15cf4bef9f656d701736b16a0"}, + {file = "umap_learn-0.5.9.post2.tar.gz", hash = "sha256:bdf60462d779bd074ce177a0714ced17e6d161285590fa487f3f9548dd3c31c9"}, +] + +[package.dependencies] 
+numba = ">=0.51.2" +numpy = ">=1.23" +pynndescent = ">=0.5" +scikit-learn = ">=1.6" +scipy = ">=1.3.1" +tqdm = "*" + +[package.extras] +parametric-umap = ["tensorflow (>=2.1)"] +plot = ["bokeh", "colorcet", "dask", "datashader", "holoviews", "matplotlib", "pandas", "scikit-image", "seaborn"] +tbb = ["tbb (>=2019.0)"] +test = ["pytest"] + +[metadata] +lock-version = "2.1" +python-versions = ">=3.13" +content-hash = "e3ffd60ecb4ec7f7c207eea1fb1b29231aaeddf55d9f7276ea1784d98dd14d15" diff --git a/app/src/content/embeds/typography/pyproject.toml b/app/src/content/embeds/typography/pyproject.toml new file mode 100644 index 0000000000000000000000000000000000000000..943bb593602b0cc77dbb7ff668d033becbaafcc7 --- /dev/null +++ b/app/src/content/embeds/typography/pyproject.toml @@ -0,0 +1,21 @@ +[project] +name = "typography-umap" +version = "0.1.0" +description = "" +authors = [ + {name = "Your Name",email = "you@example.com"} +] +readme = "README.md" +requires-python = ">=3.13" +dependencies = [ + "umap-learn (>=0.5.9.post2,<0.6.0)", + "pillow (>=11.3.0,<12.0.0)", + "scikit-learn (>=1.7.2,<2.0.0)", + "numpy (>=2.3.3,<3.0.0)", + "pandas (>=2.3.2,<3.0.0)" +] + + +[build-system] +requires = ["poetry-core>=2.0.0,<3.0.0"] +build-backend = "poetry.core.masonry.api" diff --git a/app/src/content/embeds/typography/run-full-pipeline.sh b/app/src/content/embeds/typography/run-full-pipeline.sh new file mode 100644 index 0000000000000000000000000000000000000000..0813c9232b9cddd3612052e7c8ce22b6bcac9196 --- /dev/null +++ b/app/src/content/embeds/typography/run-full-pipeline.sh @@ -0,0 +1,56 @@ +#!/bin/bash + +echo "🚀 Starting full typography pipeline for 300 fonts..." + +# Step 1: Download fonts (already running) +echo "Step 1: Downloading fonts... (in progress)" + +# Wait for step 1 to complete, then run remaining steps +echo "Step 2: Generating SVGs..." +node 2-generate-svgs.mjs + +if [ $? -eq 0 ]; then + echo "✅ Step 2 completed successfully" + + echo "Step 3: Converting to PNGs..." 
+ node 3-generate-pngs.mjs + + if [ $? -eq 0 ]; then + echo "✅ Step 3 completed successfully" + + echo "Step 4: Generating UMAP analysis..." + poetry run python 4-generate-umap.py + + if [ $? -eq 0 ]; then + echo "✅ Step 4 completed successfully" + + echo "Step 5: Generating sprite..." + node 5-generate-sprite.mjs + + if [ $? -eq 0 ]; then + echo "✅ Step 5 completed successfully" + echo "🎉 Full pipeline completed with 300 fonts!" + + # Display final stats + echo "📊 Final results:" + echo "📁 Fonts: $(ls generated/fonts/ | wc -l) TTF files" + echo "🎨 SVGs: $(ls generated/svgs/ | wc -l) SVG files" + echo "🖼️ PNGs: $(ls generated/pngs/ | wc -l) PNG files" + echo "📄 Data files:" + ls -la generated/data/ + else + echo "❌ Step 5 failed" + exit 1 + fi + else + echo "❌ Step 4 failed" + exit 1 + fi + else + echo "❌ Step 3 failed" + exit 1 + fi +else + echo "❌ Step 2 failed" + exit 1 +fi \ No newline at end of file diff --git a/app/src/content/embeds/vibe-code-d3-embeds-directives.md b/app/src/content/embeds/vibe-code-d3-embeds-directives.md new file mode 100644 index 0000000000000000000000000000000000000000..329833c8a753e2ee29aff29f8edd12da5d4481c8 --- /dev/null +++ b/app/src/content/embeds/vibe-code-d3-embeds-directives.md @@ -0,0 +1,504 @@ +## Embed Chart Authoring Guidelines + +### Quickstart (TL;DR) +- Create a single self-contained HTML fragment: root div + scoped style + IIFE script. +- Draw marks/axes in SVG; render UI (legend and controls) in HTML. +- Place legend and controls BELOW the chart (header appended after the chart). Include a legend title "Legend" and a select labeled "Metric" when relevant. +- Load data from public `/data` first, then fall back to `assets/data`. +- Use `window.ColorPalettes` for colors; stick to CSS variables for theming. + +Minimal header markup: +```html +
        +
        Legend
        +
        + +
        +
        +
        + + +
        + +
        +``` + +See also: `d3-line-simple.html`, `d3-line-quad.html`, `d3-benchmark.html`. + +Authoring rules for creating a new interactive chart as a single self-contained `.html` file under `src/content/embeds/`. These conventions are derived from `d3-bar.html`, `d3-comparison.html`, `d3-neural.html`, `d3-line.html`, and `d3-pie.html`. + +### A) Colors & palettes (MANDATORY) +- Always obtain color arrays from `window.ColorPalettes`; do not hardcode palettes. +- Use the categorical/sequential/diverging helpers and the current primary color. +- If you change `--primary-color` dynamically, call `window.ColorPalettes.refresh()` so listeners update. + +Usage: +```js +// Usage (with explicit counts) +const cat = window.ColorPalettes.getColors('categorical', 8); +const seq = window.ColorPalettes.getColors('sequential', 8); +const div = window.ColorPalettes.getColors('diverging', 7); + +// For current primary color string +const primaryHex = window.ColorPalettes.getPrimary(); + +// If you change --primary-color dynamically, call refresh to notify listeners +document.documentElement.style.setProperty('--primary-color', '#6D4AFF'); +window.ColorPalettes.refresh(); +``` + +Notes: +- Keep chart accents (lines, markers, selection) aligned with `--primary-color`. +- Prefer CSS variables for fills/strokes when possible; derive series colors via `ColorPalettes`. +- Provide a graceful fallback to CSS variables if `window.ColorPalettes` is unavailable. + +### B) Layout & form elements (HTML-only) +- All UI controls (labels, selects, sliders, buttons, toggles) must be plain HTML inside the root container. +- Do not draw controls with SVG; style them consistently (rounded 8px, custom caret, focus ring). +- Use `