Research Metrics Pipeline

Overview:
This pipeline enriches an EPrints export with metrics from OpenAlex and Scopus, writing out a consolidated CSV while supporting parallel requests, throttling, streaming checkpoints, and resume-from-existing behavior.

Prerequisites:
- Python 3.x with pandas installed.
- Working folder: C:\ResearchMetrics
- Files required:
  * research_metrics_pipeline_parallel_patched_email.py
  * eprints_export.csv
- Scopus API key set as SCOPUS_API_KEY in environment.

Configuration:
- BASE_DIR = C:\ResearchMetrics
- INPUT_CSV = eprints_export.csv
- FINAL_CSV = final_metrics_report.csv
- BACKUP_DIR = backup
- OpenAlex email: (add email for polite request)
- Concurrency: OpenAlex workers=8, Scopus workers=3
- Resume enabled: skips DOIs already processed.

How to Run:
Option 1: Double-click run_research_metrics.bat (based on run_research_metrics.txt)
Option 2: From CMD:
    cd /d C:\ResearchMetrics
    set SCOPUS_API_KEY=YOUR_KEY
    python research_metrics_pipeline_parallel_patched_email.py

Output:
- final_metrics_report.csv with original EPrints fields + OpenAlex and Scopus metrics.
- Backup files saved in backup folder if write errors occur.

Resume Behavior:
- If final_metrics_report.csv exists, script skips processed DOIs and preserves prior metrics.

Performance:
- Parallel OpenAlex fetch with retries.
- Parallel Scopus fetch with global throttling.
- Streaming writes every 100 rows.

Troubleshooting:
- Ensure SCOPUS_API_KEY is set.
- Ensure DOI column exists in input CSV.
- Close output file if locked.

Customization:
- Adjust BATCH_SIZE and CHECKPOINT_SIZE.
- Tune concurrency and delays.
- Update OpenAlex email and BASE_DIR as needed.

Notes:
- Keep Scopus API key private.
- OpenAlex uses email for polite API access.
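
Example (resume and checkpoint sketch):
The resume behavior and streaming writes described above could look roughly like the following. This is a minimal sketch, not the code in research_metrics_pipeline_parallel_patched_email.py. It uses only the standard-library csv module to stay dependency-free, and the helper names (normalize_doi, load_processed_dois, pending_rows, append_checkpoint) and the exact column name "DOI" are assumptions made for illustration.

```python
import csv
from pathlib import Path


def normalize_doi(doi):
    """Trim and lower-case a DOI so the same DOI always compares equal."""
    return (doi or "").strip().lower()


def load_processed_dois(final_csv, doi_column="DOI"):
    """Return the set of DOIs already in the output file (empty if no file)."""
    path = Path(final_csv)
    if not path.exists():
        return set()
    with path.open(newline="", encoding="utf-8") as fh:
        dois = {normalize_doi(row.get(doi_column)) for row in csv.DictReader(fh)}
    # Drop the empty string produced by rows with a blank or missing DOI.
    return dois - {""}


def pending_rows(input_rows, processed, doi_column="DOI"):
    """Yield only the input rows whose DOI has not been processed yet."""
    for row in input_rows:
        doi = normalize_doi(row.get(doi_column))
        if doi and doi not in processed:
            yield row


def append_checkpoint(final_csv, rows, fieldnames):
    """Append a batch of enriched rows; write the header only for a new file."""
    path = Path(final_csv)
    write_header = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames, extrasaction="ignore")
        if write_header:
            writer.writeheader()
        writer.writerows(rows)
```

In a run, load_processed_dois would be called once at startup against final_metrics_report.csv, and append_checkpoint once per batch of enriched rows (every 100 rows, per the Performance section). Appending in batches means an interrupted run loses at most one unwritten batch, and the next run picks up from the DOIs already on disk.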