Download in Bulk
US ETF Constituents
Introduction
Download the US ETF Constituents dataset in bulk to get the full dataset without any ETF selection bias. The bulk dataset package contains constituents data for all of the supported ETFs for every trading day.
To use the CLI, you must be a member of an organization on a paid tier.
Download History
To unlock local access to the US ETF Constituents dataset, open the Pricing page of your organization and subscribe to the US ETF Constituents data package. If you don't already subscribe to the US Equity Security Master data package, subscribe to it too. You need billing permissions to change the organization's subscriptions.
After you subscribe to local access, to download the US ETF Constituents data, open a terminal in your organization workspace and run:
$ lean data download --dataset "US ETF Constituents" --data-type "Bulk" --start "20090601" --end "20500101"
To download the US Equity Security Master, run:
$ lean data download --dataset "US Equity Security Master"
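If you call the bulk download from your own tooling, it can help to validate the date arguments before you invoke the CLI. The following is a minimal sketch, not part of the LEAN CLI: the function name `build_download_command` is hypothetical, and it only uses the flags that appear on this page (`--dataset`, `--data-type`, `--start`, `--end`, and `--overwrite`).

```python
import shlex
from datetime import datetime
from typing import List


def _parse_date(value: str) -> datetime:
    # Require exactly eight digits, then let strptime reject impossible dates.
    if len(value) != 8 or not value.isdigit():
        raise ValueError(f"Expected a YYYYMMDD date, got {value!r}")
    return datetime.strptime(value, "%Y%m%d")


def build_download_command(start: str, end: str, overwrite: bool = False) -> List[str]:
    """Assemble the bulk download command as an argument list (hypothetical helper)."""
    if _parse_date(start) > _parse_date(end):
        raise ValueError("The start date must not be after the end date")
    command = [
        "lean", "data", "download",
        "--dataset", "US ETF Constituents",
        "--data-type", "Bulk",
        "--start", start,
        "--end", end,
    ]
    if overwrite:
        command.append("--overwrite")
    return command


if __name__ == "__main__":
    # Print the command for review; pass the list to subprocess.run to execute it.
    print(shlex.join(build_download_command("20090601", "20500101")))
```

Building the command as a list rather than a single string avoids shell quoting problems with the dataset name, which contains spaces.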
Download Daily Updates
After you bulk download the US ETF Constituents dataset, new daily updates are available at 7 AM Eastern Time (ET) after each trading day. To unlock local access to the data updates, open the Pricing page of your organization and subscribe to the data package. You need billing permissions to change the organization's subscriptions.
After you subscribe to dataset updates, to update your local copy of the US ETF Constituents dataset, open a terminal in your organization workspace and run:
$ lean data download --dataset "US ETF Constituents" --data-type "Bulk" --start "20090601" --end "20500101"
To update your local copy of the US Equity Security Master, run:
$ lean data download --dataset "US Equity Security Master"
Alternatively, instead of directly calling the lean data download command, you can place a Python script in the data directory of your organization workspace and run it to update your data files. The following example script updates all of the new data that's missing from your local copy:
import os
from datetime import datetime, time, timedelta
from pytz import timezone
from os.path import abspath, dirname

# Run from the directory that contains this script (the data directory).
os.chdir(dirname(abspath(__file__)))

# Set to True to overwrite data files that already exist locally.
OVERWRITE = False

def __get_start_date() -> str:
    # Start from the date of the most recent local SPY file, if there is one.
    dir_name = "equity/usa/universes/etf/spy"
    files = [] if not os.path.exists(dir_name) else sorted(os.listdir(dir_name))
    return files[-1].split(".")[0] if files else '19980101'

def __get_end_date() -> str:
    # After 7 AM ET, request data through yesterday; before then, through two days ago.
    now = datetime.now(timezone("US/Eastern"))
    if now.time() > time(7, 0):
        return (now - timedelta(1)).strftime("%Y%m%d")
    print('New data is available at 07:00 AM ET')
    return (now - timedelta(2)).strftime("%Y%m%d")

if __name__ == "__main__":
    start, end = __get_start_date(), __get_end_date()
    if start >= end:
        exit("Your data is already up to date.")
    command = f'lean data download --dataset "US ETF Constituents" --data-type "Bulk" --start {start} --end {end}'
    if OVERWRITE:
        command += " --overwrite"
    print(command)
    os.system(command)
The preceding script checks the date of the most recent SPY data you have. If there is new data available for SPY, it downloads the new data files for all of the ETFs. You may need to adjust this script to fit your needs.
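If you adapt the script, it can be easier to test the date logic when it is separated from the file system and the clock. The following sketch restates the two date rules as pure functions. The names `latest_local_date` and `last_published_date` are hypothetical, and the file names in the usage example are illustrative only; the sketch assumes, as the script does, that each file name starts with a YYYYMMDD date.

```python
from datetime import datetime, time, timedelta
from typing import Iterable


def latest_local_date(file_names: Iterable[str], default: str = "19980101") -> str:
    """Return the YYYYMMDD prefix of the newest file name, or a default if none exist."""
    dates = sorted(name.split(".")[0] for name in file_names)
    return dates[-1] if dates else default


def last_published_date(now: datetime) -> str:
    """Return the latest date to request, given the current Eastern Time.

    Mirrors the script's rule: after 7 AM, request through yesterday;
    at or before 7 AM, request through two days ago.
    """
    days_back = 1 if now.time() > time(7, 0) else 2
    return (now - timedelta(days=days_back)).strftime("%Y%m%d")


if __name__ == "__main__":
    # Illustrative inputs only; real values come from your data directory and clock.
    start = latest_local_date(["20240301.csv", "20240304.csv"])
    end = last_published_date(datetime(2024, 3, 6, 9, 30))
    print(start, end, "up to date" if start >= end else "update needed")
```

Like the original script, this compares dates as YYYYMMDD strings, which sort in chronological order, and it does not account for weekends or market holidays.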
Price
The following table shows the price of an annual subscription to the US Equity Security Master for each organization tier:
Tier | Price ($/Year) |
---|---|
Quant Researcher | 600 |
Team | 900 |
Trading Firm | 1,200 |
Institution | 1,800 |
All of the historical US ETF Constituents data costs $3,960. A subscription to daily updates costs $1,200/year.
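As a rough budgeting aid, the prices above can be combined per tier. The following sketch assumes the three charges simply add together in the first year, which this page does not state explicitly; check the Pricing page of your organization for the actual total. The function name `first_year_cost` is hypothetical.

```python
# Prices in USD, copied from the table and paragraph above.
SECURITY_MASTER_PRICE_PER_YEAR = {
    "Quant Researcher": 600,
    "Team": 900,
    "Trading Firm": 1200,
    "Institution": 1800,
}
HISTORICAL_DATA_PRICE = 3960      # one-time cost of all historical data
DAILY_UPDATES_PRICE_PER_YEAR = 1200


def first_year_cost(tier: str, include_updates: bool = True) -> int:
    """Estimate the first-year cost, assuming the charges are additive."""
    total = HISTORICAL_DATA_PRICE + SECURITY_MASTER_PRICE_PER_YEAR[tier]
    if include_updates:
        total += DAILY_UPDATES_PRICE_PER_YEAR
    return total


if __name__ == "__main__":
    for tier in SECURITY_MASTER_PRICE_PER_YEAR:
        print(f"{tier}: ${first_year_cost(tier):,}")
```

For example, under this assumption a Team organization that buys the history, the daily updates, and the US Equity Security Master would pay 3,960 + 1,200 + 900 = $6,060 in the first year.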