Research Metrics Pipeline Overview

This pipeline enriches an EPrints export with metrics from OpenAlex (via API key) and Scopus, writing out a consolidated CSV. It supports parallel requests, throttling, streaming checkpoints, and resume-from-existing behavior.

Prerequisites:
- Python 3.x with pandas installed.
- Working folder: C:\ResearchMetrics
- Files required:
  * research_metrics_pipeline_parallel_v3.py
  * eprints_export.csv
- Scopus API key set as SCOPUS_API_KEY in the environment.
- OpenAlex API key (new): set as OPENALEX_API_KEY in the environment to enable the higher rate limit (100k requests/day).

Configuration:
- BASE_DIR = C:\ResearchMetrics
- INPUT_CSV = eprints_export.csv
- FINAL_CSV = final_metrics_report.csv
- OpenAlex identification: uses both OPENALEX_API_KEY (for auth) and YOUR_EMAIL (for the polite pool/User-Agent).
- Concurrency: OpenAlex workers = 10, Scopus workers = 3.
- Resume logic: if enabled, the script skips DOIs already present in the final report.

How to Run:
- Option 1: Double-click run_research_metrics.bat (based on run_research_metrics.txt).
- Option 2: From CMD:
    cd /d C:\ResearchMetrics
    set SCOPUS_API_KEY=YOUR_SCOPUS_KEY
    set OPENALEX_API_KEY=YOUR_OPENALEX_KEY
    python research_metrics_pipeline_parallel_v3.py

Output:
- final_metrics_report.csv with the original EPrints fields plus OpenAlex and Scopus metrics.
- Backup files saved in the backup folder if write errors occur.

Resume Behavior:
- If final_metrics_report.csv exists, the script skips already-processed DOIs and preserves their prior metrics.

Performance:
- Parallel OpenAlex fetch with retries.
- Parallel Scopus fetch with global throttling.
- Streaming writes every 100 rows.

Troubleshooting:
- Ensure SCOPUS_API_KEY is set.
- Ensure a DOI column exists in the input CSV.
- Close the output file if it is locked.

Customization:
- Adjust BATCH_SIZE and CHECKPOINT_SIZE.
- Tune concurrency and delays.
- Update the OpenAlex email and BASE_DIR as needed.

Notes:
- Keep the Scopus API key private.
- OpenAlex uses the email for polite API access.
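The OpenAlex identification described above (API key for auth, email for the polite pool) can be sketched as a URL builder. This is an illustrative sketch, not the script's actual code: the `build_openalex_url` helper and the fallback email are hypothetical, and the `api_key`/`mailto` query-parameter names follow the OpenAlex documentation, so verify them against your account tier.

```python
import os
import urllib.parse

# Assumption: contact email used for the polite pool; replace with YOUR_EMAIL.
CONTACT_EMAIL = "you@example.org"

def build_openalex_url(doi: str) -> str:
    """Build an OpenAlex works-by-DOI URL, attaching the contact email
    (polite pool) and, if present, the API key (higher rate limit)."""
    params = {"mailto": CONTACT_EMAIL}
    api_key = os.environ.get("OPENALEX_API_KEY")
    if api_key:
        params["api_key"] = api_key
    return ("https://api.openalex.org/works/https://doi.org/"
            + urllib.parse.quote(doi)
            + "?" + urllib.parse.urlencode(params))
```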
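The resume behavior (skip DOIs already present in final_metrics_report.csv) amounts to loading the existing report's DOI column into a set before fetching. A minimal sketch, assuming the output has a "DOI" column; the function name is hypothetical:

```python
import csv
import os

def load_processed_dois(final_csv_path: str, doi_column: str = "DOI") -> set:
    """Return the set of DOIs already in the final report, normalized to
    lowercase, or an empty set if the report does not exist yet."""
    if not os.path.exists(final_csv_path):
        return set()
    with open(final_csv_path, newline="", encoding="utf-8") as fh:
        reader = csv.DictReader(fh)
        return {row[doi_column].strip().lower()
                for row in reader if row.get(doi_column)}
```

DOIs are case-insensitive, so normalizing to lowercase on both sides of the comparison avoids re-fetching a DOI that only differs in capitalization.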
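The "streaming writes every 100 rows" behavior can be sketched as a buffered appender that flushes at CHECKPOINT_SIZE, so a crash loses at most one batch. The `StreamingWriter` class is an assumption for illustration, not the script's actual structure:

```python
import csv
import os

CHECKPOINT_SIZE = 100  # assumption: matches the script's default

class StreamingWriter:
    """Buffer completed rows and append them to the output CSV in batches,
    writing the header only when the file is first created."""

    def __init__(self, path, fieldnames):
        self.path = path
        self.fieldnames = fieldnames
        self.buffer = []

    def add(self, row: dict):
        self.buffer.append(row)
        if len(self.buffer) >= CHECKPOINT_SIZE:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        write_header = not os.path.exists(self.path)
        with open(self.path, "a", newline="", encoding="utf-8") as fh:
            writer = csv.DictWriter(fh, fieldnames=self.fieldnames)
            if write_header:
                writer.writeheader()
            writer.writerows(self.buffer)
        self.buffer.clear()
```

Call `flush()` once more after the last row so a final partial batch is not lost.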
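The "parallel Scopus fetch with global throttling" pattern is a worker pool whose threads share one rate limiter, so total request rate stays bounded no matter how many workers run. A sketch under stated assumptions: the `Throttle` class, `fetch_all` helper, and 0.5 s interval are illustrative, not the script's actual names or settings.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

class Throttle:
    """Global rate limiter shared by all worker threads: enforces a minimum
    interval between consecutive API calls across the whole pool."""

    def __init__(self, min_interval: float):
        self.min_interval = min_interval
        self.lock = threading.Lock()
        self.last = 0.0

    def wait(self):
        with self.lock:
            delay = self.last + self.min_interval - time.monotonic()
            if delay > 0:
                time.sleep(delay)
            self.last = time.monotonic()

def fetch_all(dois, fetch_one, workers=3, min_interval=0.5):
    """Fetch metrics for each DOI in parallel while honoring the throttle."""
    throttle = Throttle(min_interval)
    results = {}

    def task(doi):
        throttle.wait()
        return doi, fetch_one(doi)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(task, d) for d in dois]
        for fut in as_completed(futures):
            doi, data = fut.result()
            results[doi] = data
    return results
```

Holding the lock while sleeping is deliberate here: it serializes the throttled section, which is exactly what a global per-service rate limit requires.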
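The "fetch with retries" behavior is typically a wrapper that retries transient failures with exponential backoff. The `with_retries` helper below is a hypothetical sketch of that idea, not the script's actual implementation:

```python
import time

def with_retries(fn, attempts=3, base_delay=1.0):
    """Call fn(), retrying on any exception with exponential backoff
    (base_delay, 2*base_delay, ...); re-raise after the final attempt."""
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** i))
```

In a real pipeline you would catch only retryable errors (timeouts, HTTP 429/5xx) rather than every exception.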