Download in Bulk
US Equities
Introduction
Download the US Equities dataset in bulk to get the full dataset without any selection bias. The bulk dataset packages contain data for every ticker and trading day. If the resolution you download provides trade and quote data, the bulk download contains both data types. To check which data types each resolution provides, see Resolutions.
To use the CLI, you must be a member of an organization on a paid tier.
Download History
To unlock local access to the US Equities dataset, open the Pricing page of your organization and subscribe to at least one of the following data packages:
- US Equity Daily History by AlgoSeek
- US Equity Hourly History by AlgoSeek
- US Equity Minute History by AlgoSeek
- US Equity Second History by AlgoSeek
- US Equity Tick History by AlgoSeek
If you don't already subscribe to the US Equity Security Master data package, subscribe to it too. You need billing permissions to change the organization's subscriptions.
After you subscribe to local access, to download the US Equities data, follow these steps:
- Log in to the Algorithm Lab.
- On the CLI tab of the dataset listing, use the CLI Command Generator to generate your download command and then copy it.
- Open a terminal in your organization workspace and then run the command from the CLI Command Generator.
The Ticker field is irrelevant for bulk downloads because a bulk download includes data for every ticker in the dataset.
To download the US Equity Security Master, run:
$ lean data download --dataset "US Equity Security Master"
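The CLI Command Generator is the authoritative source for your download command. As a rough illustration of what such a command contains, the following sketch assembles one from the flags that appear in the example update script later on this page (`--dataset`, `--data-type`, `--resolution`, `--start`, `--end`, `--overwrite`). The helper name `build_bulk_download_command` and the sample dates are hypothetical, so check the result against the command the generator gives you.

```python
import shlex


def build_bulk_download_command(resolution, start=None, end=None, overwrite=False):
    """Return the download command as a list of arguments.

    Hypothetical helper: the flags mirror the example update script on this
    page, not an exhaustive CLI reference.
    """
    args = [
        "lean", "data", "download",
        "--dataset", "US Equities",
        "--data-type", "Bulk",
        "--resolution", resolution,
    ]
    if start:
        # When no end date is given, fall back to a single-day range.
        args += ["--start", start, "--end", end or start]
    if overwrite:
        args.append("--overwrite")
    return args


# Sample dates in YYYYMMDD format, chosen only for illustration.
command = build_bulk_download_command("daily", "20240102", "20240105")
print(shlex.join(command))
```

Keeping the command as a list of arguments rather than one string means you can pass it to `subprocess.run(command)` without worrying about shell quoting.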
Download Daily Updates
After you bulk download the US Equities dataset, new daily updates are available at 7 AM Eastern Time (ET) after each trading day. To unlock local access to the data updates, open the Pricing page of your organization and subscribe to at least one of the following data packages:
- US Equity Daily Updates by AlgoSeek
- US Equity Hourly Updates by AlgoSeek
- US Equity Minute Updates by AlgoSeek
- US Equity Second Updates by AlgoSeek
- US Equity Tick Updates by AlgoSeek
You need billing permissions to change the organization's subscriptions.
After you subscribe to dataset updates, to update your local copy of the US Equities dataset, use the CLI Command Generator to generate your download command and then run it in a terminal in your organization workspace. To update your local copy of the US Equity Security Master, run:
$ lean data download --dataset "US Equity Security Master"
Alternatively, instead of directly calling the lean data download command, you can place a Python script in the data directory of your organization workspace and run it to update your data files. The following example script updates all data resolutions:
import os
import pandas as pd
from datetime import datetime, time, timedelta
from pytz import timezone
from os.path import abspath, dirname

os.chdir(dirname(abspath(__file__)))

OVERWRITE = False

# Define a method to download the data
def __download_data(resolution, start=None, end=None):
    print(f"Updating {resolution} data...")
    command = f'lean data download --dataset "US Equities" --data-type "Bulk" --resolution "{resolution}"'
    if start:
        end = end if end else start
        command += f" --start {start} --end {end}"
    if OVERWRITE:
        command += " --overwrite"
    print(command)
    os.system(command)

def __get_end_date() -> str:
    now = datetime.now(timezone("US/Eastern"))
    if now.time() > time(7,30):
        return (now - timedelta(1)).strftime("%Y%m%d")
    print('New data is available at 07:30 AM EST')
    return (now - timedelta(2)).strftime("%Y%m%d")

def __download_high_frequency_data(latest_on_cloud):
    for resolution in ["minute", "second", "tick"]:
        dir_name = f"equity/usa/{resolution}/spy".lower()
        if not os.path.exists(dir_name):
            __download_data(resolution, '19980101')
            continue
        latest_on_disk = sorted(os.listdir(dir_name))[-1].split('_')[0]
        if latest_on_disk >= latest_on_cloud:
            print(f"{resolution} data is already up to date.")
            continue
        __download_data(resolution, latest_on_disk, latest_on_cloud)

def __download_low_frequency_data(latest_on_cloud):
    for resolution in ["daily", "hour"]:
        file_name = f"equity/usa/{resolution}/spy.zip".lower()
        if not os.path.exists(file_name):
            __download_data(resolution)
            continue
        latest_on_disk = str(pd.read_csv(file_name, header=None)[0].iloc[-1])[:8]
        if latest_on_disk >= latest_on_cloud:
            print(f"{resolution} data is already up to date.")
            continue
        __download_data(resolution)

if __name__ == "__main__":
    latest_on_cloud = __get_end_date()
    __download_low_frequency_data(latest_on_cloud)
    __download_high_frequency_data(latest_on_cloud)
The preceding script checks the date of the most recent SPY data you have for all resolutions. If there is new data available for any of these resolutions, it downloads the new data files and overwrites your hourly and daily files. If you don't intend to download all resolutions, adjust this script to your needs.
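To make the script's freshness check easier to adapt, here is a minimal sketch of the same two comparisons as standalone functions: the latest date expected in the cloud and the latest date found on disk. The function names and the sample file names (such as `20240102_trade.zip`) are assumptions for illustration. Like the script, the sketch uses a 07:30 cutoff and does not account for weekends or market holidays.

```python
from datetime import datetime, time, timedelta


def latest_date_available(now_et, release=time(7, 30)):
    """Return the most recent date (YYYYMMDD) expected to be downloadable.

    now_et is the current time in Eastern Time. After the release time,
    yesterday's data is expected; before it, the data from two days ago.
    """
    days_back = 1 if now_et.time() > release else 2
    return (now_et - timedelta(days_back)).strftime("%Y%m%d")


def latest_date_on_disk(file_names):
    """Return the newest date prefix among names like '20240102_trade.zip'."""
    return max(name.split("_")[0] for name in file_names)


def needs_update(file_names, now_et):
    """True when the newest local file is older than the newest expected date."""
    return latest_date_on_disk(file_names) < latest_date_available(now_et)


# Assumed sample directory listing for one ticker at minute resolution.
files = ["20240102_trade.zip", "20240102_quote.zip", "20240103_trade.zip"]
print(needs_update(files, datetime(2024, 1, 5, 9, 0)))
```

Comparing the dates as strings works because the YYYYMMDD format sorts in chronological order.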
Price
The following table shows the price of an annual subscription to the US Equity Security Master for each organization tier:
Tier | Price ($/Year) |
---|---|
Quant Researcher | 600 |
Team | 900 |
Trading Firm | 1,200 |
Institution | 1,800 |
The following table shows the price of the US Equity dataset subscriptions:
Resolution | Price of Historical Data ($) | Price of Daily Updates ($/Year) |
---|---|---|
Daily | 3,480 | 2,640 |
Hour | 3,480 | 2,640 |
Minute | Contact us | 2,640 |
Second | Contact us | 2,640 |
Tick | Contact us | 2,640 |
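As a worked example of combining the two tables, the following sketch estimates the first-year cost for a given tier and resolution. It assumes the Security Master subscription, the historical data price, and the daily updates subscription simply add together. That is an illustrative assumption, not a statement of how billing works, and the function name `first_year_cost` is hypothetical. Resolutions listed as "Contact us" have no published historical price, so the sketch rejects them.

```python
# Prices copied from the tables above (US dollars).
SECURITY_MASTER_PER_YEAR = {
    "Quant Researcher": 600,
    "Team": 900,
    "Trading Firm": 1200,
    "Institution": 1800,
}
HISTORICAL_DATA = {"Daily": 3480, "Hour": 3480}  # Minute, Second, Tick: contact us
DAILY_UPDATES_PER_YEAR = {
    "Daily": 2640, "Hour": 2640, "Minute": 2640, "Second": 2640, "Tick": 2640,
}


def first_year_cost(tier, resolution, include_updates=True):
    """Estimate the first-year cost, assuming the listed prices are additive."""
    if resolution not in HISTORICAL_DATA:
        raise ValueError(f"No published historical price for {resolution}; contact sales.")
    total = SECURITY_MASTER_PER_YEAR[tier] + HISTORICAL_DATA[resolution]
    if include_updates:
        total += DAILY_UPDATES_PER_YEAR[resolution]
    return total


# Team tier, daily resolution: 900 + 3,480 + 2,640
print(first_year_cost("Team", "Daily"))  # 7020
```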