API Reference¶
Running a Backtest¶
-
zipline.
run_algorithm
(...)[source]¶ Run a trading algorithm.
- Parameters
start (datetime) – The start date of the backtest.
end (datetime) – The end date of the backtest.
initialize (callable[context -> None]) – The initialize function to use for the algorithm. This is called once at the very beginning of the backtest and should be used to set up any state needed by the algorithm.
capital_base (float) – The starting capital for the backtest.
handle_data (callable[(context, BarData) -> None], optional) – The handle_data function to use for the algorithm. This is called every minute when data_frequency == 'minute' or every day when data_frequency == 'daily'.
before_trading_start (callable[(context, BarData) -> None], optional) – The before_trading_start function for the algorithm. This is called once before each trading day (after initialize on the first day).
analyze (callable[(context, pd.DataFrame) -> None], optional) – The analyze function to use for the algorithm. This function is called once at the end of the backtest and is passed the context and the performance data.
data_frequency ({'daily', 'minute'}, optional) – The data frequency to run the algorithm at.
bundle (str, optional) – The name of the data bundle to use to load the data to run the backtest with. This defaults to ‘quantopian-quandl’.
bundle_timestamp (datetime, optional) – The datetime to lookup the bundle data for. This defaults to the current time.
trading_calendar (TradingCalendar, optional) – The trading calendar to use for your backtest.
metrics_set (iterable[Metric] or str, optional) – The set of metrics to compute in the simulation. If a string is passed, resolve the set with zipline.finance.metrics.load().
benchmark_returns (pd.Series, optional) – Series of returns to use as the benchmark.
default_extension (bool, optional) – Whether the default zipline extension should be loaded. This is found at $ZIPLINE_ROOT/extension.py
extensions (iterable[str], optional) – The names of any other extensions to load. Each element may either be a dotted module path like a.b.c or a path to a python file ending in .py like a/b/c.py.
strict_extensions (bool, optional) – Whether the run should fail if any extensions fail to load. If this is false, a warning will be raised instead.
environ (mapping[str -> str], optional) – The os environment to use. Many extensions use this to get parameters. This defaults to os.environ.
blotter (str or zipline.finance.blotter.Blotter, optional) – Blotter to use with this algorithm. If passed as a string, we look for a blotter construction function registered with zipline.extensions.register and call it with no parameters. Default is a zipline.finance.blotter.SimulationBlotter that never cancels orders.
- Returns
perf – The daily performance of the algorithm.
- Return type
pd.DataFrame
See also
zipline.data.bundles.bundles()
The available data bundles.
Algorithm API¶
The following methods are available for use in the initialize, handle_data, and before_trading_start API functions.
In all listed functions, the self argument is implicitly the currently-executing TradingAlgorithm instance.
Data Object¶
-
class
zipline.protocol.
BarData
¶ Provides methods for accessing minutely and daily price/volume data from Algorithm API functions.
Also provides utility methods to determine if an asset is alive, and if it has recent trade data.
An instance of this object is passed as data to handle_data() and before_trading_start().
- Parameters
data_portal (DataPortal) – Provider for bar pricing data.
simulation_dt_func (callable) – Function which returns the current simulation time. This is usually bound to a method of TradingSimulation.
data_frequency ({'minute', 'daily'}) – The frequency of the bar data, i.e. whether the data is daily or minute bars.
restrictions (zipline.finance.asset_restrictions.Restrictions) – Object that combines and returns restricted list information from multiple sources.
universe_func (callable, optional) – Function which returns the current ‘universe’. This is for backwards compatibility with older API concepts.
-
can_trade
()¶ For the given asset or iterable of assets, returns True if all of the following are true:
The asset is alive for the session of the current simulation time (if current simulation time is not a market minute, we use the next session).
The asset’s exchange is open at the current simulation time or at the simulation calendar’s next market minute.
There is a known last price for the asset.
- Parameters
assets (zipline.assets.Asset or iterable of zipline.assets.Asset) – Asset(s) for which tradability should be determined.
Notes
The second condition above warrants some further explanation:
If the asset’s exchange calendar is identical to the simulation calendar, then this condition always returns True.
If there are market minutes in the simulation calendar outside of this asset’s exchange’s trading hours (for example, if the simulation is running on the CMES calendar but the asset is MSFT, which trades on the NYSE), during those minutes, this condition will return False (for example, 3:15 am Eastern on a weekday, during which the CMES is open but the NYSE is closed).
-
current
()¶ Returns the “current” value of the given fields for the given assets at the current simulation time.
- Parameters
assets (zipline.assets.Asset or iterable of zipline.assets.Asset) – The asset(s) for which data is requested.
fields (str or iterable[str]) – Requested data field(s). Valid field names are: “price”, “last_traded”, “open”, “high”, “low”, “close”, and “volume”.
- Returns
current_value – See notes below.
- Return type
Scalar, pandas Series, or pandas DataFrame.
Notes
The return type of this function depends on the types of its inputs:
If a single asset and a single field are requested, the returned value is a scalar (either a float or a pd.Timestamp depending on the field).
If a single asset and a list of fields are requested, the returned value is a pd.Series whose indices are the requested fields.
If a list of assets and a single field are requested, the returned value is a pd.Series whose indices are the assets.
If a list of assets and a list of fields are requested, the returned value is a pd.DataFrame. The columns of the returned frame will be the requested fields, and the index of the frame will be the requested assets.
The values produced for fields are as follows:
Requesting “price” produces the last known close price for the asset, forward-filled from an earlier minute if there is no trade this minute. If there is no last known value (either because the asset has never traded, or because it has delisted) NaN is returned. If a value is found, and we had to cross an adjustment boundary (split, dividend, etc) to get it, the value is adjusted to the current simulation time before being returned.
Requesting “open”, “high”, “low”, or “close” produces the open, high, low, or close for the current minute. If no trades occurred this minute, NaN is returned.
Requesting “volume” produces the trade volume for the current minute. If no trades occurred this minute, 0 is returned.
Requesting “last_traded” produces the datetime of the last minute in which the asset traded, even if the asset has stopped trading. If there is no last known value, pd.NaT is returned.
If the current simulation time is not a valid market time for an asset, we use the most recent market close instead.
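The return-type rules for current() can be summarized in a small sketch. This is not zipline code: current_return_kind and _is_single are hypothetical helpers that only model how the result depends on whether assets and fields are single values or iterables, and plain strings stand in for assets.

```python
def _is_single(value):
    """Treat strings and non-iterable objects as a single item (assumption)."""
    if isinstance(value, str):
        return True
    try:
        iter(value)
    except TypeError:
        return True
    return False


def current_return_kind(assets, fields):
    """Describe what current(assets, fields) is documented to return."""
    single_asset = _is_single(assets)
    single_field = _is_single(fields)
    if single_asset and single_field:
        return "scalar"
    if single_asset:
        return "Series indexed by fields"
    if single_field:
        return "Series indexed by assets"
    return "DataFrame (index=assets, columns=fields)"
```

For example, current_return_kind("AAPL", ["price", "volume"]) reports a Series indexed by fields, which matches the second rule in the notes.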
-
history
()¶ Returns a trailing window of length bar_count containing data for the given assets, fields, and frequency.
Returned data is adjusted for splits, dividends, and mergers as of the current simulation time.
The semantics for missing data are identical to the ones described in the notes for current().
- Parameters
assets (zipline.assets.Asset or iterable of zipline.assets.Asset) – The asset(s) for which data is requested.
fields (str or iterable[str]) – Requested data field(s). Valid field names are: “price”, “last_traded”, “open”, “high”, “low”, “close”, and “volume”.
bar_count (int) – Number of data observations requested.
frequency (str) – String indicating whether to load daily or minutely data observations. Pass ‘1m’ for minutely data, ‘1d’ for daily data.
- Returns
history – See notes below.
- Return type
pd.Series or pd.DataFrame or pd.Panel
Notes
The return type of this function depends on the types of assets and fields:
If a single asset and a single field are requested, the returned value is a pd.Series of length bar_count whose index is a pd.DatetimeIndex.
If a single asset and multiple fields are requested, the returned value is a pd.DataFrame with shape (bar_count, len(fields)). The frame’s index will be a pd.DatetimeIndex, and its columns will be fields.
If multiple assets and a single field are requested, the returned value is a pd.DataFrame with shape (bar_count, len(assets)). The frame’s index will be a pd.DatetimeIndex, and its columns will be assets.
If multiple assets and multiple fields are requested, the returned value is a pd.Panel with shape (len(fields), bar_count, len(assets)). The axes of the returned panel will be:
panel.items: fields
panel.major_axis: pd.DatetimeIndex of length bar_count
panel.minor_axis: assets
If the current simulation time is not a valid market time, we use the last market close instead.
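The shape rules for history() can be written down as a lookup. The helper below is a hypothetical illustration, not part of zipline: it takes the number of assets and fields (None meaning a single item was passed rather than an iterable) and returns the documented container name and shape.

```python
def history_return_shape(n_assets, n_fields, bar_count):
    """Return (container, shape) per the documented history() rules.

    n_assets / n_fields: None means a single asset or field was passed;
    an int means an iterable of that length was passed.
    """
    if n_assets is None and n_fields is None:
        return ("Series", (bar_count,))
    if n_assets is None:
        return ("DataFrame", (bar_count, n_fields))
    if n_fields is None:
        return ("DataFrame", (bar_count, n_assets))
    return ("Panel", (n_fields, bar_count, n_assets))
```

A request for 3 assets and 2 fields over 20 bars therefore corresponds to a panel of shape (2, 20, 3).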
-
is_stale
()¶ For the given asset or iterable of assets, returns True if the asset is alive and there is no trade data for the current simulation time.
If the asset has never traded, returns False.
If the current simulation time is not a valid market time, we use the current time to check if the asset is alive, but we use the last market minute/day for the trade data check.
- Parameters
assets (zipline.assets.Asset or iterable of zipline.assets.Asset) – Asset(s) for which staleness should be determined.
- Returns
is_stale – Bool or series of bools indicating whether the requested asset(s) are stale.
- Return type
Scheduling Functions¶
-
zipline.api.
schedule_function
(self, func, date_rule=None, time_rule=None, half_days=True, calendar=None)¶ Schedule a function to be called repeatedly in the future.
- Parameters
func (callable) – The function to execute when the rule is triggered. func should have the same signature as handle_data.
date_rule (zipline.utils.events.EventRule, optional) – Rule for the dates on which to execute func. If not passed, the function will run every trading day.
time_rule (zipline.utils.events.EventRule, optional) – Rule for the time at which to execute func. If not passed, the function will execute at the end of the first market minute of the day.
half_days (bool, optional) – Should this rule fire on half days? Default is True.
calendar (Sentinel, optional) – Calendar used to compute rules that depend on the trading calendar.
-
class
zipline.api.
date_rules
[source]¶ Factories for date-based schedule_function() rules.
-
static
every_day
()[source]¶ Create a rule that triggers every day.
- Returns
rule
- Return type
zipline.utils.events.EventRule
-
static
month_end
(days_offset=0)[source]¶ Create a rule that triggers a fixed number of trading days before the end of each month.
- Parameters
days_offset (int, optional) – Number of trading days prior to month end to trigger. Default is 0, i.e., trigger on the last day of the month.
- Returns
rule
- Return type
zipline.utils.events.EventRule
-
static
month_start
(days_offset=0)[source]¶ Create a rule that triggers a fixed number of trading days after the start of each month.
- Parameters
days_offset (int, optional) – Number of trading days to wait before triggering each month. Default is 0, i.e., trigger on the first trading day of the month.
- Returns
rule
- Return type
zipline.utils.events.EventRule
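To make the days_offset semantics of month_start and month_end concrete, here is a minimal sketch that picks the trigger day from a list of trading days. The helper names and the list of January 2018 sessions are illustrative assumptions; zipline derives sessions from the trading calendar rather than from a hand-written list.

```python
from datetime import date


def month_start_day(trading_days, days_offset=0):
    """Trading day that is days_offset sessions after the first one."""
    return trading_days[days_offset]


def month_end_day(trading_days, days_offset=0):
    """Trading day that is days_offset sessions before the last one."""
    return trading_days[-1 - days_offset]


# Illustrative sessions for one month (weekends and two holidays omitted).
january = [date(2018, 1, d) for d in (
    2, 3, 4, 5, 8, 9, 10, 11, 12, 16, 17, 18, 19, 22, 23, 24, 25, 26, 29, 30, 31
)]

first_session = month_start_day(january)            # offset 0: first trading day
third_session = month_start_day(january, 2)         # two sessions after the first
last_session = month_end_day(january)               # offset 0: last trading day
second_to_last = month_end_day(january, 1)          # one session before the last
```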
-
static
-
class
zipline.api.
time_rules
[source]¶ Factories for time-based schedule_function() rules.
-
every_minute
¶ alias of
Always
-
static
market_close
(offset=None, hours=None, minutes=None)[source]¶ Create a rule that triggers at a fixed offset from market close.
The offset can be specified either as a datetime.timedelta, or as a number of hours and minutes.
- Parameters
offset (datetime.timedelta, optional) – If passed, the offset from market close at which to trigger. Must be at least 1 minute.
hours (int, optional) – If passed, number of hours to wait before market close.
minutes (int, optional) – If passed, number of minutes to wait before market close.
- Returns
rule
- Return type
zipline.utils.events.EventRule
Notes
If no arguments are passed, the default offset is one minute before market close.
If offset is passed, hours and minutes must not be passed. Conversely, if either hours or minutes are passed, offset must not be passed.
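The argument rules in these notes can be modelled with a short function. This is a sketch of the documented behaviour, not zipline's implementation: resolve_offset is a hypothetical name, and applying the one-minute minimum to hours/minutes as well as to offset is an assumption. The same rules are documented for market_open.

```python
from datetime import timedelta


def resolve_offset(offset=None, hours=None, minutes=None):
    """Turn (offset | hours/minutes) into a single timedelta."""
    if offset is not None and (hours is not None or minutes is not None):
        raise ValueError("pass either offset or hours/minutes, not both")
    if offset is None:
        if hours is None and minutes is None:
            # Documented default: one minute from the market open/close.
            return timedelta(minutes=1)
        offset = timedelta(hours=hours or 0, minutes=minutes or 0)
    if offset < timedelta(minutes=1):
        # Assumption: the minimum applies however the offset was given.
        raise ValueError("offset must be at least 1 minute")
    return offset
```

Under these rules, resolve_offset(hours=1, minutes=30) and resolve_offset(offset=timedelta(minutes=90)) describe the same trigger time.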
-
static
market_open
(offset=None, hours=None, minutes=None)[source]¶ Create a rule that triggers at a fixed offset from market open.
The offset can be specified either as a datetime.timedelta, or as a number of hours and minutes.
- Parameters
offset (datetime.timedelta, optional) – If passed, the offset from market open at which to trigger. Must be at least 1 minute.
hours (int, optional) – If passed, number of hours to wait after market open.
minutes (int, optional) – If passed, number of minutes to wait after market open.
- Returns
rule
- Return type
zipline.utils.events.EventRule
Notes
If no arguments are passed, the default offset is one minute after market open.
If offset is passed, hours and minutes must not be passed. Conversely, if either hours or minutes are passed, offset must not be passed.
-
Orders¶
-
zipline.api.
order
(self, asset, amount, limit_price=None, stop_price=None, style=None)¶ Place an order for a fixed number of shares.
- Parameters
asset (Asset) – The asset to be ordered.
amount (int) – The amount of shares to order. If amount is positive, this is the number of shares to buy or cover. If amount is negative, this is the number of shares to sell or short.
limit_price (float, optional) – The limit price for the order.
stop_price (float, optional) – The stop price for the order.
style (ExecutionStyle, optional) – The execution style for the order.
- Returns
order_id – The unique identifier for this order, or None if no order was placed.
- Return type
Notes
The limit_price and stop_price arguments provide shorthands for passing common execution styles. Passing limit_price=N is equivalent to style=LimitOrder(N). Similarly, passing stop_price=M is equivalent to style=StopOrder(M), and passing limit_price=N and stop_price=M is equivalent to style=StopLimitOrder(N, M). It is an error to pass both a style and limit_price or stop_price.
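The shorthand rules for limit_price, stop_price, and style can be sketched as a small dispatch function. The namedtuples below are stand-ins that only borrow the names of the zipline execution styles, and resolve_style is a hypothetical helper; the exception type raised for conflicting arguments is also an assumption.

```python
from collections import namedtuple

# Stand-ins for the zipline.finance.execution classes (illustration only).
MarketOrder = namedtuple("MarketOrder", [])
LimitOrder = namedtuple("LimitOrder", ["limit_price"])
StopOrder = namedtuple("StopOrder", ["stop_price"])
StopLimitOrder = namedtuple("StopLimitOrder", ["limit_price", "stop_price"])


def resolve_style(limit_price=None, stop_price=None, style=None):
    """Map the order() shorthands onto a single execution style."""
    if style is not None:
        if limit_price is not None or stop_price is not None:
            raise ValueError("pass either style or limit_price/stop_price")
        return style
    if limit_price is not None and stop_price is not None:
        return StopLimitOrder(limit_price, stop_price)
    if limit_price is not None:
        return LimitOrder(limit_price)
    if stop_price is not None:
        return StopOrder(stop_price)
    # MarketOrder is documented as the default for order().
    return MarketOrder()
```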
-
zipline.api.
order_value
(self, asset, value, limit_price=None, stop_price=None, style=None)¶ Place an order for a fixed amount of money.
Equivalent to order(asset, value / data.current(asset, 'price')).
- Parameters
asset (Asset) – The asset to be ordered.
value (float) – Amount of value of asset to be transacted. The number of shares bought or sold will be equal to value / current_price.
limit_price (float, optional) – Limit price for the order.
stop_price (float, optional) – Stop price for the order.
style (ExecutionStyle) – The execution style for the order.
- Returns
order_id – The unique identifier for this order.
- Return type
Notes
See zipline.api.order() for more information about limit_price, stop_price, and style.
-
zipline.api.
order_percent
(self, asset, percent, limit_price=None, stop_price=None, style=None)¶ Place an order in the specified asset corresponding to the given percent of the current portfolio value.
- Parameters
asset (Asset) – The asset that this order is for.
percent (float) – The percentage of the portfolio value to allocate to asset. This is specified as a decimal, for example: 0.50 means 50%.
limit_price (float, optional) – The limit price for the order.
stop_price (float, optional) – The stop price for the order.
style (ExecutionStyle) – The execution style for the order.
- Returns
order_id – The unique identifier for this order.
- Return type
Notes
See zipline.api.order() for more information about limit_price, stop_price, and style.
-
zipline.api.
order_target
(self, asset, target, limit_price=None, stop_price=None, style=None)¶ Place an order to adjust a position to a target number of shares. If the position doesn’t already exist, this is equivalent to placing a new order. If the position does exist, this is equivalent to placing an order for the difference between the target number of shares and the current number of shares.
- Parameters
asset (Asset) – The asset that this order is for.
target (int) – The desired number of shares of asset.
limit_price (float, optional) – The limit price for the order.
stop_price (float, optional) – The stop price for the order.
style (ExecutionStyle) – The execution style for the order.
- Returns
order_id – The unique identifier for this order.
- Return type
Notes
order_target does not take into account any open orders. For example:
order_target(sid(0), 10)
order_target(sid(0), 10)
This code will result in 20 shares of sid(0) because the first call to order_target will not have been filled when the second order_target call is made.
See zipline.api.order() for more information about limit_price, stop_price, and style.
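The open-order pitfall described in these notes can be reproduced with a toy model. Nothing here calls zipline: the functions are hypothetical, positions are plain integers, and "netting" pending shares against the target is one possible guard rather than something the API does for you.

```python
def order_target_shares(filled_position, target):
    """Documented behaviour: only the filled position is considered."""
    return target - filled_position


def order_target_shares_netting_open(filled_position, open_order_amounts, target):
    """Hypothetical guard: also subtract shares that are already on order."""
    return target - filled_position - sum(open_order_amounts)


def simulate(net_open_orders):
    """Call the target logic twice in one bar, before anything has filled."""
    filled, open_orders = 0, []
    for _ in range(2):
        if net_open_orders:
            amount = order_target_shares_netting_open(filled, open_orders, 10)
        else:
            amount = order_target_shares(filled, 10)
        if amount != 0:
            open_orders.append(amount)
    # Assume every open order eventually fills in full.
    return filled + sum(open_orders)


naive_position = simulate(net_open_orders=False)
guarded_position = simulate(net_open_orders=True)
```

In an algorithm, a comparable guard would typically consult get_open_orders() before placing another target order for the same asset.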
-
zipline.api.
order_target_value
(self, asset, target, limit_price=None, stop_price=None, style=None)¶ Place an order to adjust a position to a target value. If the position doesn’t already exist, this is equivalent to placing a new order. If the position does exist, this is equivalent to placing an order for the difference between the target value and the current value. If the Asset being ordered is a Future, the ‘target value’ calculated is actually the target exposure, as Futures have no ‘value’.
- Parameters
asset (Asset) – The asset that this order is for.
target (float) – The desired total value of asset.
limit_price (float, optional) – The limit price for the order.
stop_price (float, optional) – The stop price for the order.
style (ExecutionStyle) – The execution style for the order.
- Returns
order_id – The unique identifier for this order.
- Return type
Notes
order_target_value does not take into account any open orders. For example:
order_target_value(sid(0), 10)
order_target_value(sid(0), 10)
This code will result in 20 dollars of sid(0) because the first call to order_target_value will not have been filled when the second order_target_value call is made.
See zipline.api.order() for more information about limit_price, stop_price, and style.
-
zipline.api.
order_target_percent
(self, asset, target, limit_price=None, stop_price=None, style=None)¶ Place an order to adjust a position to a target percent of the current portfolio value. If the position doesn’t already exist, this is equivalent to placing a new order. If the position does exist, this is equivalent to placing an order for the difference between the target percent and the current percent.
- Parameters
asset (Asset) – The asset that this order is for.
target (float) – The desired percentage of the portfolio value to allocate to asset. This is specified as a decimal, for example: 0.50 means 50%.
limit_price (float, optional) – The limit price for the order.
stop_price (float, optional) – The stop price for the order.
style (ExecutionStyle) – The execution style for the order.
- Returns
order_id – The unique identifier for this order.
- Return type
Notes
order_target_percent does not take into account any open orders. For example:
order_target_percent(sid(0), 0.10)
order_target_percent(sid(0), 0.10)
This code will result in 20% of the portfolio being allocated to sid(0) because the first call to order_target_percent will not have been filled when the second order_target_percent call is made.
See zipline.api.order() for more information about limit_price, stop_price, and style.
-
class
zipline.finance.execution.
ExecutionStyle
[source]¶ Base class for order execution styles.
-
property
exchange
¶ The exchange to which this order should be routed.
-
property
-
class
zipline.finance.execution.
MarketOrder
(exchange=None)[source]¶ Execution style for orders to be filled at current market price.
This is the default for orders placed with order().
-
class
zipline.finance.execution.
LimitOrder
(limit_price, asset=None, exchange=None)[source]¶ Execution style for orders to be filled at a price equal to or better than a specified limit price.
- Parameters
limit_price (float) – Maximum price for buys, or minimum price for sells, at which the order should be filled.
-
class
zipline.finance.execution.
StopOrder
(stop_price, asset=None, exchange=None)[source]¶ Execution style representing a market order to be placed if market price reaches a threshold.
- Parameters
stop_price (float) – Price threshold at which the order should be placed. For sells, the order will be placed if market price falls below this value. For buys, the order will be placed if market price rises above this value.
-
class
zipline.finance.execution.
StopLimitOrder
(limit_price, stop_price, asset=None, exchange=None)[source]¶ Execution style representing a limit order to be placed if market price reaches a threshold.
- Parameters
limit_price (float) – Maximum price for buys, or minimum price for sells, at which the order should be filled, if placed.
stop_price (float) – Price threshold at which the order should be placed. For sells, the order will be placed if market price falls below this value. For buys, the order will be placed if market price rises above this value.
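The price conditions described for LimitOrder, StopOrder, and StopLimitOrder can be expressed as two predicates. This is a sketch under assumptions: the function names are hypothetical, and treating a price exactly equal to the threshold as satisfying the condition is a choice made here, not something the text above states.

```python
def stop_reached(is_buy, market_price, stop_price):
    """Buys trigger when price rises to the stop; sells when it falls to it."""
    if is_buy:
        return market_price >= stop_price
    return market_price <= stop_price


def limit_satisfied(is_buy, market_price, limit_price):
    """Buys fill at or below the limit; sells fill at or above it."""
    if is_buy:
        return market_price <= limit_price
    return market_price >= limit_price


def stop_limit_can_fill(is_buy, market_price, limit_price, stop_price, stop_already_reached):
    """A stop-limit order becomes a limit order once the stop has been reached."""
    triggered = stop_already_reached or stop_reached(is_buy, market_price, stop_price)
    return triggered and limit_satisfied(is_buy, market_price, limit_price)
```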
-
zipline.api.
get_order
(self, order_id)¶ Look up an order based on the order id returned from one of the order functions.
- Parameters
order_id (str) – The unique identifier for the order.
- Returns
order – The order object.
- Return type
Order
-
zipline.api.
get_open_orders
(self, asset=None)¶ Retrieve all of the current open orders.
- Parameters
asset (Asset) – If passed and not None, return only the open orders for the given asset instead of all open orders.
- Returns
open_orders – If no asset is passed this will return a dict mapping Assets to a list containing all the open orders for the asset. If an asset is passed then this will return a list of the open orders for this asset.
- Return type
-
zipline.api.
cancel_order
(self, order_param)¶ Cancel an open order.
- Parameters
order_param (str or Order) – The order_id or order object to cancel.
Order Cancellation Policies¶
-
zipline.api.
set_cancel_policy
(self, cancel_policy)¶ Sets the order cancellation policy for the simulation.
- Parameters
cancel_policy (CancelPolicy) – The cancellation policy to use.
-
class
zipline.finance.cancel_policy.
CancelPolicy
[source]¶ Abstract cancellation policy interface.
-
abstract
should_cancel
(event)[source]¶ Should all open orders be cancelled?
- Parameters
event (enum-value) –
- An event type, one of:
zipline.gens.sim_engine.BAR
zipline.gens.sim_engine.DAY_START
zipline.gens.sim_engine.DAY_END
zipline.gens.sim_engine.MINUTE_END
- Returns
should_cancel – Should all open orders be cancelled?
- Return type
-
abstract
Assets¶
-
zipline.api.
symbol
(self, symbol_str, country_code=None)¶ Lookup an Equity by its ticker symbol.
- Parameters
- Returns
equity – The equity that held the ticker symbol on the current symbol lookup date.
- Return type
- Raises
SymbolNotFound – Raised when the symbol was not held on the current lookup date.
-
zipline.api.
symbols
(self, *args, **kwargs)¶ Look up multiple Equities as a list.
- Parameters
- Returns
equities – The equities that held the given ticker symbols on the current symbol lookup date.
- Return type
- Raises
SymbolNotFound – Raised when one of the symbols was not held on the current lookup date.
-
zipline.api.
future_symbol
(self, symbol)¶ Lookup a futures contract with a given symbol.
- Parameters
symbol (str) – The symbol of the desired contract.
- Returns
future – The future that trades with the name symbol.
- Return type
- Raises
SymbolNotFound – Raised when no contract named ‘symbol’ is found.
-
zipline.api.
set_symbol_lookup_date
(self, dt)¶ Set the date for which symbols will be resolved to their assets (symbols may map to different firms or underlying assets at different times).
- Parameters
dt (datetime) – The new symbol lookup date.
Trading Controls¶
Zipline provides trading controls to help ensure that the algorithm is performing as expected. These functions help protect the algorithm from certain bugs that could cause undesirable behavior when trading with real money.
-
zipline.api.
set_do_not_order_list
(self, restricted_list, on_error='fail')¶ Set a restriction on which assets can be ordered.
- Parameters
restricted_list (container[Asset], SecurityList) – The assets that cannot be ordered.
-
zipline.api.
set_long_only
(self, on_error='fail')¶ Set a rule specifying that this algorithm cannot take short positions.
-
zipline.api.
set_max_leverage
(self, max_leverage)¶ Set a limit on the maximum leverage of the algorithm.
- Parameters
max_leverage (float) – The maximum leverage for the algorithm. If not provided there will be no maximum.
-
zipline.api.
set_max_order_count
(self, max_count, on_error='fail')¶ Set a limit on the number of orders that can be placed in a single day.
- Parameters
max_count (int) – The maximum number of orders that can be placed on any single day.
-
zipline.api.
set_max_order_size
(self, asset=None, max_shares=None, max_notional=None, on_error='fail')¶ Set a limit on the number of shares and/or dollar value of any single order placed for the given asset. Limits are treated as absolute values and are enforced at the time that the algorithm attempts to place an order for the asset.
If an algorithm attempts to place an order that would exceed one of these limits, a TradingControlException is raised.
-
zipline.api.
set_max_position_size
(self, asset=None, max_shares=None, max_notional=None, on_error='fail')¶ Set a limit on the number of shares and/or dollar value held for the given asset. Limits are treated as absolute values and are enforced at the time that the algorithm attempts to place an order for the asset. This means that it’s possible to end up with more than the max number of shares due to splits/dividends, and more than the max notional due to price improvement.
If an algorithm attempts to place an order that would increase the absolute number of shares or dollar value held beyond one of these limits, a TradingControlException is raised.
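A rough model of the position-size check may help when reasoning about which orders would be rejected. The function below is a hypothetical reading of the description above (limits compared as absolute values, checked only at order time, and only for orders that increase the absolute position); it is not zipline's implementation and it returns the violated limits instead of raising TradingControlException.

```python
def position_limit_violations(current_shares, order_amount, price,
                              max_shares=None, max_notional=None):
    """Return the names of the limits an order would breach."""
    shares_after = current_shares + order_amount
    # Orders that shrink the absolute position are treated as always allowed.
    if abs(shares_after) <= abs(current_shares):
        return []
    violations = []
    if max_shares is not None and abs(shares_after) > abs(max_shares):
        violations.append("max_shares")
    if max_notional is not None and abs(shares_after * price) > abs(max_notional):
        violations.append("max_notional")
    return violations
```

Because the check runs only when an order is placed, a position that grew past max_shares through a split is not flagged until the algorithm tries to add to it.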
Simulation Parameters¶
-
zipline.api.
set_benchmark
(self, benchmark)¶ Set the benchmark asset.
- Parameters
benchmark (zipline.assets.Asset) – The asset to set as the new benchmark.
Notes
Any dividends paid out for that new benchmark asset will be automatically reinvested.
Commission Models¶
-
zipline.api.
set_commission
(self, us_equities=None, us_futures=None)¶ Sets the commission models for the simulation.
- Parameters
us_equities (EquityCommissionModel) – The commission model to use for trading US equities.
us_futures (FutureCommissionModel) – The commission model to use for trading US futures.
Notes
This function can only be called during initialize().
-
class
zipline.finance.commission.
CommissionModel
[source]¶ Abstract base class for commission models.
Commission models are responsible for accepting order/transaction pairs and calculating how much commission should be charged to an algorithm’s account on each transaction.
To implement a new commission model, create a subclass of CommissionModel and implement calculate().
-
abstract
calculate
(order, transaction)[source]¶ Calculate the amount of commission to charge on order as a result of transaction.
- Parameters
order (zipline.finance.order.Order) – The order being processed. The commission field of order is a float indicating the amount of commission already charged on this order.
transaction (zipline.finance.transaction.Transaction) – The transaction being processed. A single order may generate multiple transactions if there isn’t enough volume in a given bar to fill the full amount requested in the order.
- Returns
amount_charged – The additional commission, in dollars, that we should attribute to this order.
- Return type
-
abstract
Calculates a commission for a transaction based on a per share cost with an optional minimum cost per trade.
- Parameters
Notes
This is zipline’s default commission model for equities.
-
class
zipline.finance.commission.
PerTrade
(cost=0.0)[source]¶ Calculates a commission for a transaction based on a per trade cost.
For orders that require multiple fills, the full commission is charged to the first fill.
- Parameters
cost (float, optional) – The flat amount of commissions paid per equity trade.
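The "full commission is charged to the first fill" rule can be sketched with the commission-already-charged value that calculate() receives through order.commission. The functions below are hypothetical stand-ins working on plain floats, not the PerTrade class itself.

```python
def per_trade_commission(commission_already_charged, cost=0.0):
    """Charge the flat cost only if nothing has been charged on the order yet."""
    return cost if commission_already_charged == 0 else 0.0


def total_commission(n_fills, cost):
    """Accumulate commission over the fills of a single order."""
    charged = 0.0
    for _ in range(n_fills):
        charged += per_trade_commission(charged, cost)
    return charged
```

An order that needs three fills at a flat cost of 5.0 is therefore charged 5.0 in total, all of it on the first fill.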
Slippage Models¶
-
zipline.api.
set_slippage
(self, us_equities=None, us_futures=None)¶ Set the slippage models for the simulation.
- Parameters
us_equities (EquitySlippageModel) – The slippage model to use for trading US equities.
us_futures (FutureSlippageModel) – The slippage model to use for trading US futures.
Notes
This function can only be called during initialize().
-
class
zipline.finance.slippage.
SlippageModel
[source]¶ Abstract base class for slippage models.
Slippage models are responsible for the rates and prices at which orders fill during a simulation.
To implement a new slippage model, create a subclass of SlippageModel and implement process_order().
-
volume_for_bar
¶ Number of shares that have already been filled for the currently-filling asset in the current minute. This attribute is maintained automatically by the base class. It can be used by subclasses to keep track of the total amount filled if there are multiple open orders for a single asset.
- Type
Notes
Subclasses that define their own constructors should call super(<subclass name>, self).__init__() before performing other initialization.
-
abstract
process_order
(data, order)[source]¶ Compute the number of shares and price to fill for order in the current minute.
- Parameters
data (zipline.protocol.BarData) – The data for the given bar.
order (zipline.finance.order.Order) – The order to simulate.
- Returns
execution_price (float) – The price of the fill.
execution_volume (int) – The number of shares that should be filled. Must be between 0 and order.amount - order.filled. If the amount filled is less than the amount remaining, order will remain open and will be passed again to this method in the next minute.
- Raises
zipline.finance.slippage.LiquidityExceeded – May be raised if no more orders should be processed for the current asset during the current bar.
Notes
Before this method is called, volume_for_bar will be set to the number of shares that have already been filled for order.asset in the current minute.
process_order() is not called by the base class on bars for which there was no historical volume.
-
-
class
zipline.finance.slippage.
FixedSlippage
(spread=0.0)[source]¶ Simple model assuming a fixed-size spread for all assets.
- Parameters
spread (float, optional) – Size of the assumed spread for all assets. Orders to buy will be filled at close + (spread / 2). Orders to sell will be filled at close - (spread / 2).
Notes
This model does not impose limits on the size of fills. An order for an asset will always be filled as soon as any trading activity occurs in the order’s asset, even if the size of the order is greater than the historical volume.
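The fill prices described for FixedSlippage follow directly from the spread parameter. The function below is a hypothetical illustration of that arithmetic only; it does not model fill sizes or the rest of the slippage interface.

```python
def fixed_slippage_fill_price(close, spread, is_buy):
    """Buys pay half the spread above the close; sells receive half below."""
    half_spread = spread / 2
    return close + half_spread if is_buy else close - half_spread
```

With a close of 100.0 and a spread of 0.5, a buy is priced at 100.25 and a sell at 99.75.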
Model slippage as a quadratic function of percentage of historical volume.
Orders to buy will be filled at:
price * (1 + price_impact * (volume_share ** 2))
Orders to sell will be filled at:
price * (1 - price_impact * (volume_share ** 2))
where price is the close price for the bar, and volume_share is the percentage of minutely volume filled, up to a max of volume_limit.
- Parameters
volume_limit (float, optional) – Maximum percent of historical volume that can fill in each bar. 0.5 means 50% of historical volume. 1.0 means 100%. Default is 0.025 (i.e., 2.5%).
price_impact (float, optional) – Scaling coefficient for price impact. Larger values will result in more simulated price impact. Smaller values will result in less simulated price impact. Default is 0.1.
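The quadratic price-impact formulas can be evaluated directly. The function below is a sketch of that arithmetic using the documented defaults; the name volume_share_fill_price is hypothetical, and it leaves out how the model decides how many shares to fill in a bar.

```python
def volume_share_fill_price(price, volume_share, is_buy,
                            volume_limit=0.025, price_impact=0.1):
    """Apply price * (1 +/- price_impact * volume_share ** 2)."""
    # volume_share is capped at volume_limit, as in the description above.
    share = min(volume_share, volume_limit)
    sign = 1 if is_buy else -1
    return price * (1 + sign * price_impact * share ** 2)
```

At the default limits the impact is small: filling the full 2.5% of a bar's volume at a close of 100 moves the price by 100 * 0.1 * 0.025 ** 2 = 0.00625.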
Pipeline¶
For more information, see Pipeline API
-
zipline.api.
attach_pipeline
(self, pipeline, name, chunks=None, eager=True)¶ Register a pipeline to be computed at the start of each day.
- Parameters
pipeline (Pipeline) – The pipeline to have computed.
name (str) – The name of the pipeline.
chunks (int or iterator, optional) – The number of days to compute pipeline results for. Increasing this number will make it longer to get the first results but may improve the total runtime of the simulation. If an iterator is passed, we will run in chunks based on values of the iterator. Default is None.
eager (bool, optional) – Whether or not to compute this pipeline prior to before_trading_start. Default is True.
- Returns
pipeline – Returns the pipeline that was attached unchanged.
- Return type
-
zipline.api.
pipeline_output
(self, name)¶ Get the results of the pipeline that was attached with the given name.
- Parameters
name (str) – Name of the pipeline from which to fetch results.
- Returns
results – DataFrame containing the results of the requested pipeline for the current simulation date.
- Return type
pd.DataFrame
- Raises
NoSuchPipeline – Raised when no pipeline with the name name has been registered.
Miscellaneous¶
-
zipline.api.
record
(self, *args, **kwargs)¶ Track and record values each day.
- Parameters
**kwargs – The names and values to record.
Notes
These values will appear in the performance packets and the performance dataframe passed to analyze and returned from run_algorithm().
-
zipline.api.
get_environment
(self, field='platform')¶ Query the execution environment.
- Parameters
field ({'platform', 'arena', 'data_frequency', 'start', 'end', 'capital_base', '*'}) – The field to query. The options have the following meanings:
- arena : str
The arena from the simulation parameters. This will normally be 'backtest' but some systems may use this to distinguish live trading from backtesting.
- data_frequency : {'daily', 'minute'}
data_frequency tells the algorithm if it is running with daily data or minute data.
- start : datetime
The start date for the simulation.
- end : datetime
The end date for the simulation.
- capital_base : float
The starting capital for the simulation.
- platform : str
The platform that the code is running on. By default this will be the string 'zipline'. This can allow algorithms to know if they are running on the Quantopian platform instead.
- * : dict[str -> any]
Returns all of the fields in a dictionary.
- Returns
val – The value for the field queried. See above for more information.
- Return type
any
- Raises
ValueError – Raised when field is not a valid option.
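The lookup behaviour described above (one field by name, every field for '*', and ValueError otherwise) can be modelled with a plain dictionary. The names query_environment and example_env below are hypothetical and the values are made up for illustration; this is not how zipline stores its simulation parameters.

```python
from datetime import datetime


def query_environment(env, field="platform"):
    """Look up one field, or return every field when ``field`` is '*'.

    ``env`` is a hypothetical dict standing in for the simulation state.
    """
    if field == "*":
        return dict(env)
    if field not in env:
        raise ValueError(
            f"{field!r} is not a valid field; expected one of "
            f"{sorted(env)} or '*'"
        )
    return env[field]


example_env = {
    "arena": "backtest",
    "data_frequency": "daily",
    "start": datetime(2017, 3, 13),
    "end": datetime(2017, 3, 16),
    "capital_base": 100000.0,
    "platform": "zipline",
}

platform = query_environment(example_env)            # default field
frequency = query_environment(example_env, "data_frequency")
everything = query_environment(example_env, "*")     # copy of all fields
```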
-
zipline.api.
fetch_csv
(self, url, pre_func=None, post_func=None, date_column='date', date_format=None, timezone='UTC', symbol=None, mask=True, symbol_column=None, special_params_checker=None, country_code=None, **kwargs)¶ Fetch a csv from a remote url and register the data so that it is queryable from the data object.
- Parameters
url (str) – The url of the csv file to load.
pre_func (callable[pd.DataFrame -> pd.DataFrame], optional) – A callback to allow preprocessing the raw data returned from fetch_csv before dates are parsed or symbols are mapped.
post_func (callable[pd.DataFrame -> pd.DataFrame], optional) – A callback to allow postprocessing of the data after dates and symbols have been mapped.
date_column (str, optional) – The name of the column in the preprocessed dataframe containing datetime information to map the data.
date_format (str, optional) – The format of the dates in the date_column. If not provided, fetch_csv will attempt to infer the format. For information about the format of this string, see pandas.read_csv().
timezone (tzinfo or str, optional) – The timezone for the datetime in the date_column.
symbol (str, optional) – If the data is about a new asset or index then this string will be the name used to identify the values in data. For example, if fetch_csv is used to load data for VIX, this field could be the string 'VIX'.
mask (bool, optional) – Drop any rows which cannot be symbol mapped.
symbol_column (str) – If the data is attaching some new attribute to each asset then this argument is the name of the column in the preprocessed dataframe containing the symbols. This will be used along with the date information to map the sids in the asset finder.
country_code (str, optional) – Country code to use to disambiguate symbol lookups.
**kwargs – Forwarded to pandas.read_csv().
- Returns
csv_data_source – A requests source that will pull data from the url specified.
- Return type
zipline.sources.requests_csv.PandasRequestsCSV
Blotters¶
-
class
zipline.finance.blotter.blotter.
Blotter
(cancel_policy=None)[source]¶ -
batch_order
(order_arg_lists)[source]¶ Place a batch of orders.
- Parameters
order_arg_lists (iterable[tuple]) – Tuples of args that order expects.
- Returns
order_ids – The unique identifier (or None) for each of the orders placed (or not placed).
- Return type
Notes
This is required for Blotter subclasses to be able to place a batch of orders, instead of being passed the order requests one at a time.
-
abstract
cancel_all_orders_for_asset
(asset, warn=False, relay_status=True)[source]¶ Cancel all open orders for a given asset.
-
abstract
get_transactions
(bar_data)[source]¶ Creates a list of transactions based on the current open orders, slippage model, and commission model.
- Parameters
bar_data (zipline._protocol.BarData) –
Notes
- This method book-keeps the blotter’s open_orders dictionary, so that it is accurate by the time we’re done processing open orders.
- Returns
transactions_list (List) – List of transactions resulting from the current open orders. If there were no open orders, an empty list is returned.
commissions_list (List) – List of commissions resulting from filling the open orders. A commission is an object with “asset” and “cost” parameters.
closed_orders (List) – List of all the orders that have filled.
-
abstract
hold
(order_id, reason='')[source]¶ Mark the order with order_id as ‘held’. Held is functionally similar to ‘open’. When a fill (full or partial) arrives, the status will automatically change back to open/filled as necessary.
-
abstract
order
(asset, amount, style, order_id=None)[source]¶ Place an order.
- Parameters
asset (zipline.assets.Asset) – The asset that this order is for.
amount (int) – The amount of shares to order. If amount is positive, this is the number of shares to buy or cover. If amount is negative, this is the number of shares to sell or short.
style (zipline.finance.execution.ExecutionStyle) – The execution style for the order.
order_id (str, optional) – The unique identifier for this order.
- Returns
order_id – The unique identifier for this order, or None if no order was placed.
- Return type
Notes
amount > 0 :: Buy/Cover
amount < 0 :: Sell/Short
Market order: order(asset, amount)
Limit order: order(asset, amount, style=LimitOrder(limit_price))
Stop order: order(asset, amount, style=StopOrder(stop_price))
StopLimit order: order(asset, amount, style=StopLimitOrder(limit_price, stop_price))
-
abstract
process_splits
(splits)[source]¶ Processes a list of splits by modifying any open orders as needed.
-
abstract
prune_orders
(closed_orders)[source]¶ Removes all given orders from the blotter’s open_orders list.
- Parameters
closed_orders (iterable of orders that are closed.) –
- Returns
- Return type
-
abstract
reject
(order_id, reason='')[source]¶ Mark the given order as ‘rejected’, which is functionally similar to cancelled. The distinction is that rejections are involuntary (and usually include a message from a broker indicating why the order was rejected) while cancels are typically user-driven.
-
-
class
zipline.finance.blotter.
SimulationBlotter
(equity_slippage=None, future_slippage=None, equity_commission=None, future_commission=None, cancel_policy=None)[source]¶ -
-
cancel_all_orders_for_asset
(asset, warn=False, relay_status=True)[source]¶ Cancel all open orders for a given asset.
-
get_transactions
(bar_data)[source]¶ Creates a list of transactions based on the current open orders, slippage model, and commission model.
- Parameters
bar_data (zipline._protocol.BarData) –
Notes
- This method book-keeps the blotter’s open_orders dictionary, so that it is accurate by the time we’re done processing open orders.
- Returns
transactions_list (List) – List of transactions resulting from the current open orders. If there were no open orders, an empty list is returned.
commissions_list (List) – List of commissions resulting from filling the open orders. A commission is an object with “asset” and “cost” parameters.
closed_orders (List) – List of all the orders that have filled.
-
hold
(order_id, reason='')[source]¶ Mark the order with order_id as ‘held’. Held is functionally similar to ‘open’. When a fill (full or partial) arrives, the status will automatically change back to open/filled as necessary.
-
order
(asset, amount, style, order_id=None)[source]¶ Place an order.
- Parameters
asset (zipline.assets.Asset) – The asset that this order is for.
amount (int) – The amount of shares to order. If amount is positive, this is the number of shares to buy or cover. If amount is negative, this is the number of shares to sell or short.
style (zipline.finance.execution.ExecutionStyle) – The execution style for the order.
order_id (str, optional) – The unique identifier for this order.
- Returns
order_id – The unique identifier for this order, or None if no order was placed.
- Return type
Notes
amount > 0 :: Buy/Cover
amount < 0 :: Sell/Short
Market order: order(asset, amount)
Limit order: order(asset, amount, style=LimitOrder(limit_price))
Stop order: order(asset, amount, style=StopOrder(stop_price))
StopLimit order: order(asset, amount, style=StopLimitOrder(limit_price, stop_price))
-
Pipeline API¶
-
class
zipline.pipeline.
Pipeline
(columns=None, screen=None, domain=GENERIC)[source]¶ A Pipeline object represents a collection of named expressions to be compiled and executed by a PipelineEngine.
A Pipeline has two important attributes: ‘columns’, a dictionary of named Term instances, and ‘screen’, a Filter representing criteria for including an asset in the results of a Pipeline.
To compute a pipeline in the context of a TradingAlgorithm, users must call attach_pipeline in their initialize function to register that the pipeline should be computed each trading day. The most recent outputs of an attached pipeline can be retrieved by calling pipeline_output from handle_data, before_trading_start, or a scheduled function.
- Parameters
columns (dict, optional) – Initial columns.
screen (zipline.pipeline.Filter, optional) – Initial screen.
-
add
(term, name, overwrite=False)[source]¶ Add a column.
The results of computing term will show up as a column in the DataFrame produced by running this pipeline.
- Parameters
term (zipline.pipeline.Term) – A Filter, Factor, or Classifier to add to the pipeline.
name (str) – Name of the column to add.
overwrite (bool) – Whether to overwrite the existing entry if we already have a column named name.
-
domain
(default)[source]¶ Get the domain for this pipeline.
If an explicit domain was provided at construction time, use it.
Otherwise, infer a domain from the registered columns.
If no domain can be inferred, return default.
- Parameters
default (zipline.pipeline.domain.Domain) – Domain to use if no domain can be inferred from this pipeline by itself.
- Returns
domain – The domain for the pipeline.
- Return type
zipline.pipeline.domain.Domain
- Raises
AmbiguousDomain –
ValueError – If the terms in self conflict with self._domain.
-
remove
(name)[source]¶ Remove a column.
-
set_screen
(screen, overwrite=False)[source]¶ Set a screen on this Pipeline.
- Parameters
screen (zipline.pipeline.Filter) – The filter to apply as a screen.
overwrite (bool) – Whether to overwrite any existing screen. If overwrite is False and self.screen is not None, we raise an error.
-
show_graph
(format='svg')[source]¶ Render this Pipeline as a DAG.
- Parameters
format ({'svg', 'png', 'jpeg'}) – Image format to render with. Default is ‘svg’.
-
to_execution_plan
(domain, default_screen, start_date, end_date)[source]¶ Compile into an ExecutionPlan.
- Parameters
domain (zipline.pipeline.domain.Domain) – Domain on which the pipeline will be executed.
default_screen (zipline.pipeline.Term) – Term to use as a screen if self.screen is None.
all_dates (pd.DatetimeIndex) – A calendar of dates to use to calculate starts and ends for each term.
start_date (pd.Timestamp) – The first date of requested output.
end_date (pd.Timestamp) – The last date of requested output.
- Returns
graph – Graph encoding term dependencies, including metadata about extra row requirements.
- Return type
zipline.pipeline.graph.ExecutionPlan
-
to_simple_graph
(default_screen)[source]¶ Compile into a simple TermGraph with no extra row metadata.
- Parameters
default_screen (zipline.pipeline.Term) – Term to use as a screen if self.screen is None.
- Returns
graph – Graph encoding term dependencies.
- Return type
zipline.pipeline.graph.TermGraph
-
property
columns
¶ The output columns of this pipeline.
-
property
screen
¶ The screen of this pipeline.
- Returns
screen – Term defining the screen for this pipeline. If screen is a filter, rows that do not pass the filter (i.e., rows for which the filter computed False) will be dropped from the output of this pipeline before returning results.
- Return type
Notes
Setting a screen on a Pipeline does not change the values produced for any rows: it only affects whether a given row is returned. Computing a pipeline with a screen is logically equivalent to computing the pipeline without the screen and then, as a post-processing step, filtering out any rows for which the screen computed False.
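The equivalence stated in the note can be checked on a toy example. The sketch below uses plain dictionaries as a stand-in for one day of pipeline output; run_pipeline_rows, values, and passes are hypothetical names, not zipline objects.

```python
def run_pipeline_rows(rows, screen=None):
    """Return asset -> value, dropping assets that the screen rejects.

    ``rows`` maps asset -> computed value and ``screen`` maps
    asset -> bool. Both are hypothetical stand-ins for pipeline output.
    """
    if screen is None:
        return dict(rows)
    return {asset: value for asset, value in rows.items() if screen[asset]}


values = {"AAPL": 1.0, "MSFT": 2.0, "MCD": 3.0, "BK": 4.0}
passes = {"AAPL": True, "MSFT": False, "MCD": True, "BK": False}

# Computing with the screen ...
screened = run_pipeline_rows(values, screen=passes)
# ... gives the same rows as computing everything and filtering afterwards.
filtered_afterwards = {
    asset: value
    for asset, value in run_pipeline_rows(values).items()
    if passes[asset]
}
```

The surviving values are identical in both cases; only the set of returned rows changes.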
-
class
zipline.pipeline.
CustomFactor
(inputs=sentinel('NotSpecified'), outputs=sentinel('NotSpecified'), window_length=sentinel('NotSpecified'), mask=sentinel('NotSpecified'), dtype=sentinel('NotSpecified'), missing_value=sentinel('NotSpecified'), ndim=sentinel('NotSpecified'), **kwargs)[source]¶ Base class for user-defined Factors.
- Parameters
inputs (iterable, optional) – An iterable of BoundColumn instances (e.g. USEquityPricing.close), describing the data to load and pass to self.compute. If this argument is not passed to the CustomFactor constructor, we look for a class-level attribute named inputs.
outputs (iterable[str], optional) – An iterable of strings which represent the names of each output this factor should compute and return. If this argument is not passed to the CustomFactor constructor, we look for a class-level attribute named outputs.
window_length (int, optional) – Number of rows to pass for each input. If this argument is not passed to the CustomFactor constructor, we look for a class-level attribute named window_length.
mask (zipline.pipeline.Filter, optional) – A Filter describing the assets on which we should compute each day. Each call to CustomFactor.compute will only receive assets for which mask produced True on the day for which compute is being called.
Notes
Users implementing their own Factors should subclass CustomFactor and implement a method named compute with the following signature:
def compute(self, today, assets, out, *inputs):
    ...
On each simulation date, compute will be called with the current date, an array of sids, an output array, and an input array for each expression passed as inputs to the CustomFactor constructor.
The specific types of the values passed to compute are as follows:
today : np.datetime64[ns]
    Row label for the last row of all arrays passed as `inputs`.
assets : np.array[int64, ndim=1]
    Column labels for `out` and `inputs`.
out : np.array[self.dtype, ndim=1]
    Output array of the same shape as `assets`. `compute` should write its desired return values into `out`. If multiple outputs are specified, `compute` should write its desired return values into `out.<output_name>` for each output name in `self.outputs`.
*inputs : tuple of np.array
    Raw data arrays corresponding to the values of `self.inputs`.
compute functions should expect to be passed NaN values for dates on which no data was available for an asset. This may include dates on which an asset did not yet exist.
For example, if a CustomFactor requires 10 rows of close price data, and asset A started trading on Monday June 2nd, 2014, then on Tuesday, June 3rd, 2014, the column of input data for asset A will have 9 leading NaNs for the preceding days on which data was not yet available.
Examples
A CustomFactor with pre-declared defaults:
from numpy import nanmax, nanmin

from zipline.pipeline import CustomFactor
from zipline.pipeline.data import USEquityPricing


class TenDayRange(CustomFactor):
    """
    Computes the difference between the highest high in the last 10
    days and the lowest low.

    Pre-declares high and low as default inputs and `window_length` as
    10.
    """

    inputs = [USEquityPricing.high, USEquityPricing.low]
    window_length = 10

    def compute(self, today, assets, out, highs, lows):
        highest_highs = nanmax(highs, axis=0)
        lowest_lows = nanmin(lows, axis=0)
        out[:] = highest_highs - lowest_lows


# Doesn't require passing inputs or window_length because they're
# pre-declared as defaults for the TenDayRange class.
ten_day_range = TenDayRange()
A CustomFactor without defaults:
class MedianValue(CustomFactor):
    """
    Computes the median value of an arbitrary single input over an
    arbitrary window.

    Does not declare any defaults, so values for `window_length` and
    `inputs` must be passed explicitly on every construction.
    """

    def compute(self, today, assets, out, data):
        from numpy import nanmedian
        out[:] = nanmedian(data, axis=0)


# Values for `inputs` and `window_length` must be passed explicitly to
# MedianValue.
median_close10 = MedianValue([USEquityPricing.close], window_length=10)
median_low15 = MedianValue([USEquityPricing.low], window_length=15)
A CustomFactor with multiple outputs:
class MultipleOutputs(CustomFactor):
    inputs = [USEquityPricing.close]
    outputs = ['alpha', 'beta']
    window_length = N

    def compute(self, today, assets, out, close):
        computed_alpha, computed_beta = some_function(close)
        out.alpha[:] = computed_alpha
        out.beta[:] = computed_beta


# Each output is returned as its own Factor upon instantiation.
alpha, beta = MultipleOutputs()

# Equivalently, we can create a single factor instance and access each
# output as an attribute of that instance.
multiple_outputs = MultipleOutputs()
alpha = multiple_outputs.alpha
beta = multiple_outputs.beta
Note: If a CustomFactor has multiple outputs, all outputs must have the same dtype. For instance, in the example above, if alpha is a float then beta must also be a float.
-
class
zipline.pipeline.
Filter
(inputs=sentinel('NotSpecified'), outputs=sentinel('NotSpecified'), window_length=sentinel('NotSpecified'), mask=sentinel('NotSpecified'), domain=sentinel('NotSpecified'), *args, **kwargs)[source]¶ Pipeline expression computing a boolean output.
Filters are most commonly useful for describing sets of assets to include or exclude for some particular purpose. Many Pipeline API functions accept a mask argument, which can be supplied a Filter indicating that only values passing the Filter should be considered when performing the requested computation. For example, zipline.pipeline.Factor.top() accepts a mask indicating that ranks should be computed only on assets that passed the specified Filter.
The most common way to construct a Filter is via one of the comparison operators (<, <=, !=, eq, >, >=) of Factor. For example, a natural way to construct a Filter for stocks with a 10-day VWAP less than $20.0 is to first construct a Factor computing 10-day VWAP and compare it to the scalar value 20.0:
>>> from zipline.pipeline.factors import VWAP
>>> vwap_10 = VWAP(window_length=10)
>>> vwaps_under_20 = (vwap_10 <= 20)
Filters can also be constructed via comparisons between two Factors. For example, to construct a Filter producing True for asset/date pairs where the asset’s 10-day VWAP was greater than its 30-day VWAP:
>>> short_vwap = VWAP(window_length=10)
>>> long_vwap = VWAP(window_length=30)
>>> higher_short_vwap = (short_vwap > long_vwap)
Filters can be combined via the & (and) and | (or) operators.
&-ing together two filters produces a new Filter that produces True if both of the inputs produced True.
|-ing together two filters produces a new Filter that produces True if either of its inputs produced True.
The ~ operator can be used to invert a Filter, swapping all True values with Falses and vice-versa.
Filters may be set as the screen attribute of a Pipeline, indicating asset/date pairs for which the filter produces False should be excluded from the Pipeline’s output. This is useful both for reducing noise in the output of a Pipeline and for reducing memory consumption of Pipeline results.
-
__and__
(other)¶ Binary Operator: ‘&’
-
__or__
(other)¶ Binary Operator: ‘|’
-
if_else
(if_true, if_false)[source]¶ Create a term that selects values from one of two choices.
- Parameters
if_true (zipline.pipeline.term.ComputableTerm) – Expression whose values should be used at locations where this filter outputs True.
if_false (zipline.pipeline.term.ComputableTerm) – Expression whose values should be used at locations where this filter outputs False.
- Returns
merged – A term that computes by taking values from either if_true or if_false, depending on the values produced by self.
The returned term draws from if_true at locations where self produces True, and it draws from if_false at locations where self produces False.
- Return type
zipline.pipeline.term.ComputableTerm
Example
Let f be a Factor that produces the following output:
             AAPL   MSFT    MCD     BK
2017-03-13    1.0    2.0    3.0    4.0
2017-03-14    5.0    6.0    7.0    8.0
Let g be another Factor that produces the following output:
             AAPL   MSFT    MCD     BK
2017-03-13   10.0   20.0   30.0   40.0
2017-03-14   50.0   60.0   70.0   80.0
Finally, let condition be a Filter that produces the following output:
             AAPL   MSFT    MCD     BK
2017-03-13   True  False   True  False
2017-03-14   True   True  False  False
Then, the expression condition.if_else(f, g) produces the following output:
             AAPL   MSFT    MCD     BK
2017-03-13    1.0   20.0    3.0   40.0
2017-03-14    5.0    6.0   70.0   80.0
See also
numpy.where(), Factor.fillna()
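The selection rule in the example can be reproduced with nested lists in place of pipeline terms. This is only a sketch of the element-wise semantics; the standalone if_else function below is hypothetical and operates on plain Python data rather than on Filters and Factors.

```python
def if_else(condition, if_true, if_false):
    """Pick values element-wise from if_true or if_false.

    All three arguments are equally shaped lists of rows (one row per
    date, one entry per asset). Hypothetical helper for illustration.
    """
    return [
        [t if c else f for c, t, f in zip(c_row, t_row, f_row)]
        for c_row, t_row, f_row in zip(condition, if_true, if_false)
    ]


# Columns are AAPL, MSFT, MCD, BK; rows are 2017-03-13 and 2017-03-14.
f = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]
g = [[10.0, 20.0, 30.0, 40.0], [50.0, 60.0, 70.0, 80.0]]
condition = [[True, False, True, False], [True, True, False, False]]

merged = if_else(condition, f, g)
# merged == [[1.0, 20.0, 3.0, 40.0], [5.0, 6.0, 70.0, 80.0]]
```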
-
-
class
zipline.pipeline.
Factor
(inputs=sentinel('NotSpecified'), outputs=sentinel('NotSpecified'), window_length=sentinel('NotSpecified'), mask=sentinel('NotSpecified'), domain=sentinel('NotSpecified'), *args, **kwargs)[source]¶ Pipeline API expression producing a numerical or date-valued output.
Factors are the most commonly-used Pipeline term, representing the result of any computation producing a numerical result.
Factors can be combined, both with other Factors and with scalar values, via any of the builtin mathematical operators (+, -, *, etc).
This makes it easy to write complex expressions that combine multiple Factors. For example, constructing a Factor that computes the average of two other Factors is simply:
>>> f1 = SomeFactor(...)
>>> f2 = SomeOtherFactor(...)
>>> average = (f1 + f2) / 2.0
Factors can also be converted into zipline.pipeline.Filter objects via comparison operators: (<, <=, !=, eq, >, >=).
There are many natural operators defined on Factors besides the basic numerical operators. These include methods for identifying missing or extreme-valued outputs (isnull(), notnull(), isnan(), notnan()), methods for normalizing outputs (rank(), demean(), zscore()), and methods for constructing Filters based on rank-order properties of results (top(), bottom(), percentile_between()).
-
eq
(other)¶ Construct a Filter computing self == other.
- Parameters
other (zipline.pipeline.Factor, float) – Right-hand side of the expression.
- Returns
filter – Filter computing self == other with the outputs of self and other.
- Return type
-
demean
(mask=sentinel('NotSpecified'), groupby=sentinel('NotSpecified'))[source]¶ Construct a Factor that computes self and subtracts the mean from each row of the result.
If mask is supplied, ignore values where mask returns False when computing row means, and output NaN anywhere the mask is False.
If groupby is supplied, compute by partitioning each row based on the values produced by groupby, de-meaning the partitioned arrays, and stitching the sub-results back together.
- Parameters
mask (zipline.pipeline.Filter, optional) – A Filter defining values to ignore when computing means.
groupby (zipline.pipeline.Classifier, optional) – A classifier defining partitions over which to compute means.
Examples
Let f be a Factor which would produce the following output:
              AAPL    MSFT     MCD      BK
2017-03-13     1.0     2.0     3.0     4.0
2017-03-14     1.5     2.5     3.5     1.0
2017-03-15     2.0     3.0     4.0     1.5
2017-03-16     2.5     3.5     1.0     2.0
Let c be a Classifier producing the following output:
              AAPL    MSFT     MCD      BK
2017-03-13       1       1       2       2
2017-03-14       1       1       2       2
2017-03-15       1       1       2       2
2017-03-16       1       1       2       2
Let m be a Filter producing the following output:
              AAPL    MSFT     MCD      BK
2017-03-13   False    True    True    True
2017-03-14    True   False    True    True
2017-03-15    True    True   False    True
2017-03-16    True    True    True   False
Then f.demean() will subtract the mean from each row produced by f.
              AAPL    MSFT     MCD      BK
2017-03-13  -1.500  -0.500   0.500   1.500
2017-03-14  -0.625   0.375   1.375  -1.125
2017-03-15  -0.625   0.375   1.375  -1.125
2017-03-16   0.250   1.250  -1.250  -0.250
f.demean(mask=m) will subtract the mean from each row, but means will be calculated ignoring values on the diagonal, and NaNs will be written to the diagonal in the output. Diagonal values are ignored because they are the locations where the mask m produced False.
              AAPL    MSFT     MCD      BK
2017-03-13     NaN  -1.000   0.000   1.000
2017-03-14  -0.500     NaN   1.500  -1.000
2017-03-15  -0.166   0.833     NaN  -0.666
2017-03-16   0.166   1.166  -1.333     NaN
f.demean(groupby=c) will subtract the group-mean of AAPL/MSFT and MCD/BK from their respective entries. AAPL and MSFT are grouped together because both assets always produce 1 in the output of the classifier c. Similarly, MCD and BK are grouped together because they always produce 2.
              AAPL    MSFT     MCD      BK
2017-03-13  -0.500   0.500  -0.500   0.500
2017-03-14  -0.500   0.500   1.250  -1.250
2017-03-15  -0.500   0.500   1.250  -1.250
2017-03-16  -0.500   0.500  -0.500   0.500
f.demean(mask=m, groupby=c) will also subtract the group-mean of AAPL/MSFT and MCD/BK, but means will be calculated ignoring values on the diagonal, and NaNs will be written to the diagonal in the output.
              AAPL    MSFT     MCD      BK
2017-03-13     NaN   0.000  -0.500   0.500
2017-03-14   0.000     NaN   1.250  -1.250
2017-03-15  -0.500   0.500     NaN   0.000
2017-03-16  -0.500   0.500   0.000     NaN
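The row-wise arithmetic behind these tables can be sketched with the standard library alone. The function demean_row below is a hypothetical illustration of the mask and groupby semantics for a single row; it is not zipline's implementation and assumes the input row contains no NaNs.

```python
import math
from collections import defaultdict


def demean_row(values, mask=None, groups=None):
    """De-mean one row, optionally within groups and ignoring masked cells.

    Masked-out positions (mask is False) are excluded from the means
    and produce NaN in the output. Hypothetical helper for illustration.
    """
    n = len(values)
    mask = [True] * n if mask is None else mask
    groups = [0] * n if groups is None else groups

    # Collect the unmasked positions belonging to each group.
    members = defaultdict(list)
    for i in range(n):
        if mask[i]:
            members[groups[i]].append(i)

    out = [math.nan] * n
    for positions in members.values():
        mean = sum(values[i] for i in positions) / len(positions)
        for i in positions:
            out[i] = values[i] - mean
    return out


# The 2017-03-14 row of f, with the matching rows of m and c.
row = [1.5, 2.5, 3.5, 1.0]
plain = demean_row(row)
masked = demean_row(row, mask=[True, False, True, True])
grouped = demean_row(row, groups=[1, 1, 2, 2])
both = demean_row(row, mask=[True, False, True, True], groups=[1, 1, 2, 2])
```

Each result corresponds to the 2017-03-14 row of the matching table above.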
Notes
Mean is sensitive to the magnitudes of outliers. When working with a factor that can potentially produce large outliers, it is often useful to use the mask parameter to discard values at the extremes of the distribution:
>>> base = MyFactor(...)
>>> normalized = base.demean(
...     mask=base.percentile_between(1, 99),
... )
demean() is only supported on Factors of dtype float64.
-
zscore
(mask=sentinel('NotSpecified'), groupby=sentinel('NotSpecified'))[source]¶ Construct a Factor that Z-Scores each day’s results.
The Z-Score of a row is defined as:
(row - row.mean()) / row.stddev()
If mask is supplied, ignore values where mask returns False when computing row means and standard deviations, and output NaN anywhere the mask is False.
If groupby is supplied, compute by partitioning each row based on the values produced by groupby, z-scoring the partitioned arrays, and stitching the sub-results back together.
- Parameters
mask (zipline.pipeline.Filter, optional) – A Filter defining values to ignore when Z-Scoring.
groupby (zipline.pipeline.Classifier, optional) – A classifier defining partitions over which to compute Z-Scores.
- Returns
zscored – A Factor that z-scores the output of self.
- Return type
Notes
Mean and standard deviation are sensitive to the magnitudes of outliers. When working with a factor that can potentially produce large outliers, it is often useful to use the mask parameter to discard values at the extremes of the distribution:
>>> base = MyFactor(...)
>>> normalized = base.zscore(
...     mask=base.percentile_between(1, 99),
... )
zscore() is only supported on Factors of dtype float64.
Examples
See demean() for an in-depth example of the semantics for mask and groupby.
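For a concrete feel of the formula (row - row.mean()) / row.stddev(), here is a small standard-library sketch for one row. The helper zscore_row is hypothetical, and it assumes the population standard deviation (no degrees-of-freedom correction); the text above does not say which convention stddev uses, so treat that as an assumption.

```python
import math
import statistics


def zscore_row(values):
    """Z-score one row: subtract the row mean, divide by the row stddev.

    Assumes the population standard deviation. Returns NaNs when the
    row has zero spread. Hypothetical helper for illustration.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [math.nan] * len(values)
    return [(value - mean) / std for value in values]


# The 2017-03-13 row of f from the demean() example.
zscores = zscore_row([1.0, 2.0, 3.0, 4.0])
```

The z-scored row has mean zero and unit (population) standard deviation, so it is symmetric around zero for this evenly spaced input.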
-
rank
(method='ordinal', ascending=True, mask=sentinel('NotSpecified'), groupby=sentinel('NotSpecified'))[source]¶ Construct a new Factor representing the sorted rank of each column within each row.
- Parameters
method (str, {'ordinal', 'min', 'max', 'dense', 'average'}) – The method used to assign ranks to tied elements. See scipy.stats.rankdata for a full description of the semantics for each ranking method. Default is ‘ordinal’.
ascending (bool, optional) – Whether to return sorted rank in ascending or descending order. Default is True.
mask (zipline.pipeline.Filter, optional) – A Filter representing assets to consider when computing ranks. If mask is supplied, ranks are computed ignoring any asset/date pairs for which mask produces a value of False.
groupby (zipline.pipeline.Classifier, optional) – A classifier defining partitions over which to perform ranking.
- Returns
ranks – A new factor that will compute the ranking of the data produced by self.
- Return type
Notes
The default value for method is different from the default for scipy.stats.rankdata. See that function’s documentation for a full description of the valid inputs to method.
Missing or non-existent data on a given day will cause an asset to be given a rank of NaN for that day.
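To illustrate the default 'ordinal' method and the NaN behaviour described in the notes, the following sketch ranks a single row in plain Python. The helper ordinal_rank_row is hypothetical; it assumes ranks start at 1 and that ties are broken by position in the row, which is how scipy.stats.rankdata documents its 'ordinal' method.

```python
import math


def ordinal_rank_row(values, ascending=True):
    """Assign distinct 1-based ranks to one row; NaN inputs stay NaN.

    Ties are broken by position in the row. Hypothetical helper for
    illustration, not zipline's ranking code.
    """
    valid = [i for i, value in enumerate(values) if not math.isnan(value)]
    # sorted() is stable, so tied values keep their original order.
    ordered = sorted(valid, key=lambda i: values[i], reverse=not ascending)
    ranks = [math.nan] * len(values)
    for position, i in enumerate(ordered, start=1):
        ranks[i] = float(position)
    return ranks


row = [3.0, math.nan, 1.0, 3.0]
ascending_ranks = ordinal_rank_row(row)                     # [2.0, nan, 1.0, 3.0]
descending_ranks = ordinal_rank_row(row, ascending=False)   # [1.0, nan, 3.0, 2.0]
```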
-
pearsonr
(target, correlation_length, mask=sentinel('NotSpecified'))[source]¶ Construct a new Factor that computes rolling pearson correlation coefficients between target and the columns of self.
- Parameters
target (zipline.pipeline.Term) – The term used to compute correlations against each column of data produced by self. This may be a Factor, a BoundColumn or a Slice. If target is two-dimensional, correlations are computed asset-wise.
correlation_length (int) – Length of the lookback window over which to compute each correlation coefficient.
mask (zipline.pipeline.Filter, optional) – A Filter describing which assets should have their correlation with the target slice computed each day.
- Returns
correlations – A new Factor that will compute correlations between target and the columns of self.
- Return type
Notes
This method can only be called on expressions which are deemed safe for use as inputs to windowed Factor objects. Examples of such expressions include BoundColumn, Returns, and any factors created from rank() or zscore().
Examples
Suppose we want to create a factor that computes the correlation between AAPL’s 10-day returns and the 10-day returns of all other assets, computing each correlation over 30 days. This can be achieved by doing the following:
returns = Returns(window_length=10)
returns_slice = returns[sid(24)]
aapl_correlations = returns.pearsonr(
    target=returns_slice, correlation_length=30,
)
This is equivalent to doing:
aapl_correlations = RollingPearsonOfReturns(
    target=sid(24), returns_length=10, correlation_length=30,
)
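What "rolling" means here can be shown on two short series: each output is the Pearson coefficient over the most recent correlation_length observations. The functions pearson and rolling_pearson below are hypothetical standard-library helpers operating on one target series and one column; they are a sketch of the idea, not the pipeline machinery.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (NaN if flat)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return math.nan
    return cov / math.sqrt(var_x * var_y)


def rolling_pearson(target, column, correlation_length):
    """One coefficient per trailing window of correlation_length rows."""
    return [
        pearson(target[end - correlation_length:end],
                column[end - correlation_length:end])
        for end in range(correlation_length, len(target) + 1)
    ]


target = [1.0, 2.0, 3.0, 4.0, 5.0]
column = [2.0, 4.0, 6.0, 8.0, 7.0]
correlations = rolling_pearson(target, column, correlation_length=3)
# The first two windows are perfectly correlated; the last one is not.
```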
-
spearmanr
(target, correlation_length, mask=sentinel('NotSpecified'))[source]¶ Construct a new Factor that computes rolling spearman rank correlation coefficients between target and the columns of self.
- Parameters
target (zipline.pipeline.Term) – The term used to compute correlations against each column of data produced by self. This may be a Factor, a BoundColumn or a Slice. If target is two-dimensional, correlations are computed asset-wise.
correlation_length (int) – Length of the lookback window over which to compute each correlation coefficient.
mask (zipline.pipeline.Filter, optional) – A Filter describing which assets should have their correlation with the target slice computed each day.
- Returns
correlations – A new Factor that will compute correlations between target and the columns of self.
- Return type
Notes
This method can only be called on expressions which are deemed safe for use as inputs to windowed Factor objects. Examples of such expressions include BoundColumn, Returns, and any factors created from rank() or zscore().
Examples
Suppose we want to create a factor that computes the correlation between AAPL’s 10-day returns and the 10-day returns of all other assets, computing each correlation over 30 days. This can be achieved by doing the following:
    returns = Returns(window_length=10)
    returns_slice = returns[sid(24)]
    aapl_correlations = returns.spearmanr(
        target=returns_slice,
        correlation_length=30,
    )
This is equivalent to doing:
    aapl_correlations = RollingSpearmanOfReturns(
        target=sid(24),
        returns_length=10,
        correlation_length=30,
    )
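A Spearman rank correlation is the Pearson correlation of the ranks of the data. The sketch below shows that relationship in plain Python for a single window. The helpers `average_ranks` and `spearman` are hypothetical names, and giving tied values their average rank is an assumption of this sketch rather than a statement about zipline's internals.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy) if sx and sy else math.nan


# A monotonic but non-linear relationship still has a rank correlation
# of (approximately, up to floating point) 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.0, 8.0, 27.0, 64.0]
print(spearman(xs, ys))
```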
See also
-
linear_regression
(target, regression_length, mask=sentinel('NotSpecified'))[source]¶ Construct a new Factor that performs an ordinary least-squares regression predicting the columns of self from target.
- Parameters
target (zipline.pipeline.Term) – The term to use as the predictor/independent variable in each regression. This may be a Factor, a BoundColumn or a Slice. If target is two-dimensional, regressions are computed asset-wise.
regression_length (int) – Length of the lookback window over which to compute each regression.
mask (zipline.pipeline.Filter, optional) – A Filter describing which assets should be regressed with the target slice each day.
- Returns
regressions – A new Factor that will compute linear regressions of target against the columns of self.
- Return type
Notes
This method can only be called on expressions which are deemed safe for use as inputs to windowed Factor objects. Examples of such expressions include BoundColumn, Returns, and any factors created from rank() or zscore().
Examples
Suppose we want to create a factor that regresses AAPL’s 10-day returns against the 10-day returns of all other assets, computing each regression over 30 days. This can be achieved by doing the following:
    returns = Returns(window_length=10)
    returns_slice = returns[sid(24)]
    aapl_regressions = returns.linear_regression(
        target=returns_slice,
        regression_length=30,
    )
This is equivalent to doing:
    aapl_regressions = RollingLinearRegressionOfReturns(
        target=sid(24),
        returns_length=10,
        regression_length=30,
    )
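As a rough illustration of the regression performed in each window, the sketch below fits `column = alpha + beta * target` by ordinary least squares over a trailing window. It is plain Python, not zipline code; `ols` and `rolling_ols` are hypothetical helpers, and the sketch assumes the target is not constant within a window.

```python
def ols(x, y):
    """Least-squares fit of y = alpha + beta * x; returns (alpha, beta)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx  # assumes x varies within the window
    alpha = mean_y - beta * mean_x
    return alpha, beta


def rolling_ols(column, target, regression_length):
    """Regress the trailing window of `column` on `target` for each day."""
    out = []
    for end in range(regression_length, len(column) + 1):
        start = end - regression_length
        out.append(ols(target[start:end], column[start:end]))
    return out


target = [0.0, 1.0, 2.0, 3.0]
column = [1.0, 3.0, 5.0, 7.0]  # exactly 1 + 2 * target
for alpha, beta in rolling_ols(column, target, 3):
    print(alpha, beta)
```

Here the target plays the role of the independent variable, matching the description of `target` as the predictor.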
See also
-
winsorize
(min_percentile, max_percentile, mask=sentinel('NotSpecified'), groupby=sentinel('NotSpecified'))[source]¶ Construct a new factor that winsorizes the result of this factor.
Winsorizing changes values ranked less than the minimum percentile to the value at the minimum percentile. Similarly, values ranking above the maximum percentile are changed to the value at the maximum percentile.
Winsorizing is useful for limiting the impact of extreme data points without completely removing those points.
If mask is supplied, ignore values where mask returns False when computing percentile cutoffs, and output NaN anywhere the mask is False.
If groupby is supplied, winsorization is applied separately to each group defined by groupby.
- Parameters
min_percentile (float, int) – Entries with values at or below this percentile will be replaced with the (len(input) * min_percentile)th lowest value. If low values should not be clipped, use 0.
max_percentile (float, int) – Entries with values at or above this percentile will be replaced with the (len(input) * max_percentile)th lowest value. If high values should not be clipped, use 1.
mask (zipline.pipeline.Filter, optional) – A Filter defining values to ignore when winsorizing.
groupby (zipline.pipeline.Classifier, optional) – A classifier defining partitions over which to winsorize.
- Returns
winsorized – A Factor producing a winsorized version of self.
- Return type
Examples
    price = USEquityPricing.close.latest
    columns = {
        'PRICE': price,
        'WINSOR_1': price.winsorize(
            min_percentile=0.25, max_percentile=0.75
        ),
        'WINSOR_2': price.winsorize(
            min_percentile=0.50, max_percentile=1.0
        ),
        'WINSOR_3': price.winsorize(
            min_percentile=0.0, max_percentile=0.5
        ),
    }
Given a pipeline with columns, defined above, the result for a given day could look like:
               'PRICE'  'WINSOR_1'  'WINSOR_2'  'WINSOR_3'
    Asset_1       1         2           4           3
    Asset_2       2         2           4           3
    Asset_3       3         3           4           3
    Asset_4       4         4           4           4
    Asset_5       5         5           5           4
    Asset_6       6         5           5           4
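The clamping step itself can be sketched in a few lines of plain Python. This is an illustration only: the function below is hypothetical, ignores mask, groupby and NaN handling, and the way it turns a percentile into a cutoff index is an assumption. With the 0.25 and 0.75 percentiles it yields the same values as the WINSOR_1 column above.

```python
import math


def winsorize(values, min_percentile, max_percentile):
    """Clamp the lowest and highest entries of `values` to cutoff values."""
    n = len(values)
    if n == 0:
        return []
    ordered = sorted(values)
    # Assumed cutoff convention: index int(n * min_percentile) from the
    # bottom, and index ceil(n * max_percentile) - 1 from the bottom.
    low_index = min(int(n * min_percentile), n - 1)
    high_index = max(math.ceil(n * max_percentile), 1) - 1
    low, high = ordered[low_index], ordered[high_index]
    return [min(max(v, low), high) for v in values]


prices = [1, 2, 3, 4, 5, 6]
print(winsorize(prices, 0.25, 0.75))  # [2, 2, 3, 4, 5, 5]
```

Extreme values are pulled in to the cutoffs rather than dropped, so the output always has the same length as the input.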
-
quantiles
(bins, mask=sentinel('NotSpecified'))[source]¶ Construct a Classifier computing quantiles of the output of self.
Every non-NaN data point in the output is labelled with an integer value from 0 to (bins - 1). NaNs are labelled with -1.
If mask is supplied, ignore data points in locations for which mask produces False, and emit a label of -1 at those locations.
- Parameters
bins (int) – Number of bin labels to compute.
mask (zipline.pipeline.Filter, optional) – Mask of values to ignore when computing quantiles.
- Returns
quantiles – A classifier producing integer labels ranging from 0 to (bins - 1).
- Return type
zipline.pipeline.Classifier
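The labelling scheme can be sketched for a single row in plain Python. The helper below is hypothetical and assigns bins purely by rank, which is an assumption: zipline's own binning may place bin edges and ties differently. What it shows is the labelling convention described above, with labels 0 to (bins - 1) for valid data and -1 for NaN.

```python
import math


def quantile_labels(values, bins):
    """Label each non-NaN value with a bin from 0 to bins - 1; NaN gets -1."""
    valid = [i for i, v in enumerate(values) if not math.isnan(v)]
    valid.sort(key=lambda i: values[i])
    labels = [-1] * len(values)
    for rank, i in enumerate(valid):
        # Integer arithmetic spreads the ranks evenly over the bins.
        labels[i] = rank * bins // len(valid)
    return labels


row = [40.0, 10.0, math.nan, 30.0, 20.0]
print(quantile_labels(row, 2))  # [1, 0, -1, 1, 0]
print(quantile_labels(row, 4))  # [3, 0, -1, 2, 1]
```

Under this sketch, quartiles, quintiles and deciles correspond to `bins` of 4, 5 and 10.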
-
quartiles
(mask=sentinel('NotSpecified'))[source]¶ Construct a Classifier computing quartiles over the output of self.
Every non-NaN data point in the output is labelled with a value of either 0, 1, 2, or 3, corresponding to the first, second, third, or fourth quartile over each row. NaN data points are labelled with -1.
If mask is supplied, ignore data points in locations for which mask produces False, and emit a label of -1 at those locations.
- Parameters
mask (zipline.pipeline.Filter, optional) – Mask of values to ignore when computing quartiles.
- Returns
quartiles – A classifier producing integer labels ranging from 0 to 3.
- Return type
zipline.pipeline.Classifier
-
quintiles
(mask=sentinel('NotSpecified'))[source]¶ Construct a Classifier computing quintile labels on self.
Every non-NaN data point in the output is labelled with a value of either 0, 1, 2, 3, or 4, corresponding to quintiles over each row. NaN data points are labelled with -1.
If mask is supplied, ignore data points in locations for which mask produces False, and emit a label of -1 at those locations.
- Parameters
mask (zipline.pipeline.Filter, optional) – Mask of values to ignore when computing quintiles.
- Returns
quintiles – A classifier producing integer labels ranging from 0 to 4.
- Return type
zipline.pipeline.Classifier
-
deciles
(mask=sentinel('NotSpecified'))[source]¶ Construct a Classifier computing decile labels on self.
Every non-NaN data point in the output is labelled with a value from 0 to 9 corresponding to deciles over each row. NaN data points are labelled with -1.
If mask is supplied, ignore data points in locations for which mask produces False, and emit a label of -1 at those locations.
- Parameters
mask (zipline.pipeline.Filter, optional) – Mask of values to ignore when computing deciles.
- Returns
deciles – A classifier producing integer labels ranging from 0 to 9.
- Return type
zipline.pipeline.Classifier
-
top
(N, mask=sentinel('NotSpecified'), groupby=sentinel('NotSpecified'))[source]¶ Construct a Filter matching the top N asset values of self each day.
If groupby is supplied, returns a Filter matching the top N asset values for each group.
- Parameters
N (int) – Number of assets passing the returned filter each day.
mask (zipline.pipeline.Filter, optional) – A Filter representing assets to consider when computing ranks. If mask is supplied, top values are computed ignoring any asset/date pairs for which mask produces a value of False.
groupby (zipline.pipeline.Classifier, optional) – A classifier defining partitions over which to perform ranking.
- Returns
filter
- Return type
-
bottom
(N, mask=sentinel('NotSpecified'), groupby=sentinel('NotSpecified'))[source]¶ Construct a Filter matching the bottom N asset values of self each day.
If groupby is supplied, returns a Filter matching the bottom N asset values for each group defined by groupby.
- Parameters
N (int) – Number of assets passing the returned filter each day.
mask (zipline.pipeline.Filter, optional) – A Filter representing assets to consider when computing ranks. If mask is supplied, bottom values are computed ignoring any asset/date pairs for which mask produces a value of False.
groupby (zipline.pipeline.Classifier, optional) – A classifier defining partitions over which to perform ranking.
- Returns
filter
- Return type
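The selection performed by top and bottom, including the effect of mask, can be sketched for a single day in plain Python. `top_n` and `bottom_n` are hypothetical helpers; the sketch ignores groupby and NaN handling.

```python
def top_n(values, n, mask=None):
    """Return booleans marking the n largest values among unmasked entries."""
    if mask is None:
        mask = [True] * len(values)
    candidates = [i for i, keep in enumerate(mask) if keep]
    candidates.sort(key=lambda i: values[i], reverse=True)
    winners = set(candidates[:n])
    return [i in winners for i in range(len(values))]


def bottom_n(values, n, mask=None):
    """The n smallest values are the n largest of the negated values."""
    return top_n([-v for v in values], n, mask)


values = [5.0, 1.0, 4.0, 2.0]
print(top_n(values, 2))                                   # [True, False, True, False]
print(top_n(values, 2, mask=[False, True, True, True]))   # [False, False, True, True]
print(bottom_n(values, 1))                                # [False, True, False, False]
```

Masked-out assets never pass the filter and do not count toward N, which is why the second call selects the values 4.0 and 2.0.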
-
percentile_between
(min_percentile, max_percentile, mask=sentinel('NotSpecified'))[source]¶ Construct a Filter matching values of self that fall within the range defined by min_percentile and max_percentile.
- Parameters
min_percentile (float [0.0, 100.0]) – Return True for assets falling above this percentile in the data.
max_percentile (float [0.0, 100.0]) – Return True for assets falling below this percentile in the data.
mask (zipline.pipeline.Filter, optional) – A Filter representing assets to consider when calculating percentile thresholds. If mask is supplied, percentile cutoffs are computed each day using only assets for which mask returns True. Assets for which mask produces False will produce False in the output of the returned Filter as well.
- Returns
out – A new filter that will compute the specified percentile-range mask.
- Return type
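A single-day sketch of this filter in plain Python follows. The helpers are hypothetical; linear interpolation between data points and treating both bounds as inclusive are assumptions of the sketch, not documented behaviour.

```python
def percentile(sorted_values, q):
    """Linearly interpolated q-th percentile (0 <= q <= 100)."""
    position = (len(sorted_values) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    low_value, high_value = sorted_values[lower], sorted_values[upper]
    return low_value + (high_value - low_value) * fraction


def percentile_between(values, min_percentile, max_percentile, mask=None):
    """Mark values lying between the two percentile cutoffs."""
    if mask is None:
        mask = [True] * len(values)
    # Cutoffs are computed only from entries where the mask is True.
    considered = sorted(v for v, keep in zip(values, mask) if keep)
    if not considered:
        return [False] * len(values)
    low = percentile(considered, min_percentile)
    high = percentile(considered, max_percentile)
    return [keep and low <= v <= high for v, keep in zip(values, mask)]


values = [1.0, 2.0, 3.0, 4.0, 5.0]
print(percentile_between(values, 25, 75))
# [False, True, True, True, False]
print(percentile_between(values, 0, 50, mask=[True, True, True, True, False]))
# [True, True, False, False, False]
```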
-
isnan
()[source]¶ A Filter producing True for all values where this Factor is NaN.
- Returns
nanfilter
- Return type
-
notnan
()[source]¶ A Filter producing True for values where this Factor is not NaN.
- Returns
nanfilter
- Return type
-
isfinite
()[source]¶ A Filter producing True for values where this Factor is anything but NaN, inf, or -inf.
-
clip
(min_bound, max_bound, mask=sentinel('NotSpecified'))[source]¶ Clip (limit) the values in a factor.
Given an interval, values outside the interval are clipped to the interval edges. For example, if an interval of [0, 1] is specified, values smaller than 0 become 0, and values larger than 1 become 1.
- Parameters
min_bound (float) – The minimum value to use.
max_bound (float) – The maximum value to use.
mask (zipline.pipeline.Filter, optional) – A Filter representing assets to consider when clipping.
Notes
To only clip values on one side, -np.inf and np.inf may be passed. For example, to only clip the maximum value but not clip a minimum value:
    factor.clip(min_bound=-np.inf, max_bound=user_provided_max)
See also
numpy.clip()
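The effect of clipping, including the one-sided case, can be sketched on a plain list. The `clip` function here is a stand-in written for illustration and uses `math.inf` from the standard library in place of `np.inf`.

```python
import math


def clip(values, min_bound, max_bound):
    """Limit each value to the closed interval [min_bound, max_bound]."""
    return [min(max(v, min_bound), max_bound) for v in values]


values = [-2.0, 0.5, 3.0]
print(clip(values, 0.0, 1.0))        # [0.0, 0.5, 1.0]
print(clip(values, -math.inf, 1.0))  # [-2.0, 0.5, 1.0]
```

Passing an infinite bound leaves that side untouched, because no finite value lies beyond it.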
-
__add__
(other)¶ Construct a Factor computing self + other.
- Parameters
other (zipline.pipeline.Factor, float) – Right-hand side of the expression.
- Returns
factor – Factor computing self + other with outputs of self and other.
- Return type
-
__sub__
(other)¶ Construct a Factor computing self - other.
- Parameters
other (zipline.pipeline.Factor, float) – Right-hand side of the expression.
- Returns
factor – Factor computing self - other with outputs of self and other.
- Return type
-
__mul__
(other)¶ Construct a Factor computing self * other.
- Parameters
other (zipline.pipeline.Factor, float) – Right-hand side of the expression.
- Returns
factor – Factor computing self * other with outputs of self and other.
- Return type
-
__div__
(other)¶ Construct a Factor computing self / other.
- Parameters
other (zipline.pipeline.Factor, float) – Right-hand side of the expression.
- Returns
factor – Factor computing self / other with outputs of self and other.
- Return type
-
__mod__
(other)¶ Construct a Factor computing self % other.
- Parameters
other (zipline.pipeline.Factor, float) – Right-hand side of the expression.
- Returns
factor – Factor computing self % other with outputs of self and other.
- Return type
-
__pow__
(other)¶ Construct a Factor computing self ** other.
- Parameters
other (zipline.pipeline.Factor, float) – Right-hand side of the expression.
- Returns
factor – Factor computing self ** other with outputs of self and other.
- Return type
-
__lt__
(other)¶ Construct a Filter computing self < other.
- Parameters
other (zipline.pipeline.Factor, float) – Right-hand side of the expression.
- Returns
filter – Filter computing self < other with the outputs of self and other.
- Return type
-
__le__
(other)¶ Construct a Filter computing self <= other.
- Parameters
other (zipline.pipeline.Factor, float) – Right-hand side of the expression.
- Returns
filter – Filter computing self <= other with the outputs of self and other.
- Return type
-
__ne__
(other)¶ Construct a Filter computing self != other.
- Parameters
other (zipline.pipeline.Factor, float) – Right-hand side of the expression.
- Returns
filter – Filter computing self != other with the outputs of self and other.
- Return type
-
__ge__
(other)¶ Construct a Filter computing self >= other.
- Parameters
other (zipline.pipeline.Factor, float) – Right-hand side of the expression.
- Returns
filter – Filter computing self >= other with the outputs of self and other.
- Return type
-
__gt__
(other)¶ Construct a Filter computing self > other.
- Parameters
other (zipline.pipeline.Factor, float) – Right-hand side of the expression.
- Returns
filter – Filter computing self > other with the outputs of self and other.
- Return type
-
fillna
(fill_value)¶ Create a new term that fills missing values of this term’s output with fill_value.
- Parameters
fill_value (zipline.pipeline.ComputableTerm, or object.) –
Object to use as replacement for missing values.
If a ComputableTerm (e.g. a Factor) is passed, that term’s results will be used as fill values.
If a scalar (e.g. a number) is passed, the scalar will be used as a fill value.
Examples
Filling with a Scalar:
Let f be a Factor which would produce the following output:
                  AAPL   MSFT    MCD     BK
    2017-03-13     1.0    NaN    3.0    4.0
    2017-03-14     1.5    2.5    NaN    NaN
Then f.fillna(0) produces the following output:
                  AAPL   MSFT    MCD     BK
    2017-03-13     1.0    0.0    3.0    4.0
    2017-03-14     1.5    2.5    0.0    0.0
Filling with a Term:
Let f be as above, and let g be another Factor which would produce the following output:
                  AAPL   MSFT    MCD     BK
    2017-03-13    10.0   20.0   30.0   40.0
    2017-03-14    15.0   25.0   35.0   45.0
Then, f.fillna(g) produces the following output:
                  AAPL   MSFT    MCD     BK
    2017-03-13     1.0   20.0    3.0    4.0
    2017-03-14     1.5    2.5   35.0   45.0
- Returns
filled – A term computing the same results as self, but with missing values filled in using values from fill_value.
- Return type
zipline.pipeline.ComputableTerm
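The two filling modes can be reproduced on a single row with plain Python. The `fillna` function below is a hypothetical stand-in that works on lists, using the 2017-03-13 rows of f and g from the examples above.

```python
import math


def fillna(row, fill_value):
    """Replace NaN entries with a scalar, or with the matching entry of another row."""
    if isinstance(fill_value, (int, float)):
        fills = [float(fill_value)] * len(row)
    else:
        fills = list(fill_value)
    return [fill if math.isnan(v) else v for v, fill in zip(row, fills)]


f = [1.0, math.nan, 3.0, 4.0]  # AAPL, MSFT, MCD, BK on 2017-03-13
g = [10.0, 20.0, 30.0, 40.0]
print(fillna(f, 0))  # [1.0, 0.0, 3.0, 4.0]
print(fillna(f, g))  # [1.0, 20.0, 3.0, 4.0]
```

Only the missing positions are taken from the fill source; every other value of f is left unchanged.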
-
mean
(mask=sentinel('NotSpecified'))¶ Create a 1-dimensional factor computing the mean of self, each day.
- Parameters
mask (zipline.pipeline.Filter, optional) – A Filter representing assets to consider when computing results. If supplied, we ignore asset/date pairs where mask produces False.
- Returns
result
- Return type
-
stddev
(mask=sentinel('NotSpecified'))¶ Create a 1-dimensional factor computing the stddev of self, each day.
- Parameters
mask (zipline.pipeline.Filter, optional) – A Filter representing assets to consider when computing results. If supplied, we ignore asset/date pairs where mask produces False.
- Returns
result
- Return type
-
max
(mask=sentinel('NotSpecified'))¶ Create a 1-dimensional factor computing the max of self, each day.
- Parameters
mask (zipline.pipeline.Filter, optional) – A Filter representing assets to consider when computing results. If supplied, we ignore asset/date pairs where mask produces False.
- Returns
result
- Return type
-
min
(mask=sentinel('NotSpecified'))¶ Create a 1-dimensional factor computing the min of self, each day.
- Parameters
mask (zipline.pipeline.Filter, optional) – A Filter representing assets to consider when computing results. If supplied, we ignore asset/date pairs where mask produces False.
- Returns
result
- Return type
-
median
(mask=sentinel('NotSpecified'))¶ Create a 1-dimensional factor computing the median of self, each day.
- Parameters
mask (zipline.pipeline.Filter, optional) – A Filter representing assets to consider when computing results. If supplied, we ignore asset/date pairs where mask produces False.
- Returns
result
- Return type
-
sum
(mask=sentinel('NotSpecified'))¶ Create a 1-dimensional factor computing the sum of self, each day.
- Parameters
mask (zipline.pipeline.Filter, optional) – A Filter representing assets to consider when computing results. If supplied, we ignore asset/date pairs where mask produces False.
- Returns
result
- Return type
-
-
class
zipline.pipeline.
Term
(domain=sentinel('NotSpecified'), dtype=sentinel('NotSpecified'), missing_value=sentinel('NotSpecified'), window_safe=sentinel('NotSpecified'), ndim=sentinel('NotSpecified'), *args, **kwargs)[source]¶ Base class for objects that can appear in the compute graph of a zipline.pipeline.Pipeline.
Notes
Most Pipeline API users only interact with Term via subclasses:
Classifier
Instances of Term are memoized. If you call a Term’s constructor with the same arguments twice, the same object will be returned from both calls:
Example:
    >>> from zipline.pipeline.data import EquityPricing
    >>> from zipline.pipeline.factors import SimpleMovingAverage
    >>> x = SimpleMovingAverage(inputs=[EquityPricing.close], window_length=5)
    >>> y = SimpleMovingAverage(inputs=[EquityPricing.close], window_length=5)
    >>> x is y
    True
Warning
Memoization of terms means that it’s generally unsafe to modify attributes of a term after construction.
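To see why mutation is risky, consider a toy memoized class in plain Python. `MemoizedTerm` is a hypothetical class written for this illustration and is not how zipline implements Term; it only mimics the behaviour of returning a cached instance for repeated constructor arguments.

```python
class MemoizedTerm:
    """Toy class whose constructor returns a cached instance per argument set."""

    _cache = {}

    def __new__(cls, inputs, window_length):
        key = (cls, tuple(inputs), window_length)
        try:
            return cls._cache[key]
        except KeyError:
            instance = super().__new__(cls)
            instance.inputs = tuple(inputs)
            instance.window_length = window_length
            cls._cache[key] = instance
            return instance


x = MemoizedTerm(inputs=["close"], window_length=5)
y = MemoizedTerm(inputs=["close"], window_length=5)
z = MemoizedTerm(inputs=["close"], window_length=10)
print(x is y)  # True
print(x is z)  # False

# x and y are the same object, so a change made through x is visible via y.
x.window_length = 99
print(y.window_length)  # 99
```

After the mutation, any later request for the `window_length=5` term still returns the cached object that now reports 99, which is the kind of surprise the warning is about.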
-
class
zipline.pipeline.data.
DataSet
[source]¶ Base class for Pipeline datasets.
A DataSet is defined by two parts:
- A collection of Column objects that describe the queryable attributes of the dataset.
- A Domain describing the assets and calendar of the data represented by the DataSet.
To create a new Pipeline dataset, define a subclass of DataSet and set one or more Column objects as class-level attributes. Each column requires a np.dtype that describes the type of data that should be produced by a loader for the dataset. Integer columns must also provide a “missing value” to be used when no value is available for a given asset/date combination.
By default, the domain of a dataset is the special singleton value, GENERIC, which means that it can be used in a Pipeline running on any domain.
In some cases, it may be preferable to restrict a dataset to support only a single domain. For example, a DataSet may describe data from a vendor that only covers the US. To restrict a dataset to a specific domain, define a domain attribute at class scope.
You can also define a domain-specific version of a generic DataSet by calling its specialize method with the domain of interest.
Examples
The built-in EquityPricing dataset is defined as follows:
    class EquityPricing(DataSet):
        open = Column(float)
        high = Column(float)
        low = Column(float)
        close = Column(float)
        volume = Column(float)
The built-in USEquityPricing dataset is a specialization of EquityPricing. It is defined as:
    from zipline.pipeline.domain import US_EQUITIES
    USEquityPricing = EquityPricing.specialize(US_EQUITIES)
Columns can have types other than float. A dataset containing assorted company metadata might be defined like this:
    class CompanyMetadata(DataSet):
        # Use float for semantically-numeric data, even if it's always
        # integral valued (see Notes section below). The default missing
        # value for floats is NaN.
        shares_outstanding = Column(float)

        # Use object for string columns. The default missing value for
        # object-dtype columns is None.
        ticker = Column(object)

        # Use integers for integer-valued categorical data like sector or
        # industry codes. Integer-dtype columns require an explicit missing
        # value.
        sector_code = Column(int, missing_value=-1)

        # Use bool for boolean-valued flags. Note that the default missing
        # value for bool-dtype columns is False.
        is_primary_share = Column(bool)
Notes
Because numpy has no native support for integers with missing values, users are strongly encouraged to use floats for any data that’s semantically numeric. Doing so enables the use of NaN as a natural missing value, which has useful propagation semantics.
-
classmethod
get_column
(name)[source]¶ Look up a column by name.
- Parameters
name (str) – Name of the column to look up.
- Returns
column – Column with the given name.
- Return type
- Raises
AttributeError – If no column with the given name exists.
-
class
zipline.pipeline.data.
Column
(dtype, missing_value=sentinel('NotSpecified'), doc=None, metadata=None, currency_aware=False)[source]¶ An abstract column of data, not yet associated with a dataset.
-
class
zipline.pipeline.data.
BoundColumn
(dtype, missing_value, dataset, name, doc, metadata, currency_conversion, currency_aware)[source]¶ A column of data that’s been concretely bound to a particular dataset.
-
dtype
¶ The dtype of data produced when this column is loaded.
- Type
-
latest
¶ A Filter, Factor, or Classifier computing the most recently known value of this column on each date. See zipline.pipeline.mixins.LatestMixin for more details.
- Type
zipline.pipeline.LoadableTerm
-
dataset
¶ The dataset to which this column is bound.
Notes
Instances of this class are dynamically created upon access to attributes of DataSet. For example, close is an instance of this class. Pipeline API users should never construct instances of this directly.
-
property
currency_aware
¶ Whether or not this column produces currency-denominated data.
-
property
currency_conversion
¶ Specification for currency conversions applied for this term.
-
property
dataset
¶ The dataset to which this column is bound.
-
fx
(currency)[source]¶ Construct a currency-converted version of this column.
- Parameters
currency (str or zipline.currency.Currency) – Currency into which to convert this column’s data.
- Returns
column – Column producing the same data as self, but currency-converted into currency.
- Return type
-
property
metadata
¶ A copy of the metadata for this column.
-
property
name
¶ The name of this column.
-
property
qualname
¶ The fully-qualified name of this column.
-
-
class
zipline.pipeline.data.
DataSetFamily
[source]¶ Base class for Pipeline dataset families.
Dataset families are used to represent data where the unique identifier for a row requires more than just asset and date coordinates. A DataSetFamily can also be thought of as a collection of DataSet objects, each of which has the same columns, domain, and ndim.
DataSetFamily objects are defined with one or more Column objects, plus one additional field: extra_dims.
The extra_dims field defines coordinates other than asset and date that must be fixed to produce a logical timeseries. The column objects determine columns that will be shared by slices of the family.
extra_dims are represented as an ordered dictionary where the keys are the dimension name, and the values are a set of unique values along that dimension.
To work with a DataSetFamily in a pipeline expression, one must choose a specific value for each of the extra dimensions using the slice() method. For example, given a DataSetFamily:
    class SomeDataSet(DataSetFamily):
        extra_dims = [
            ('dimension_0', {'a', 'b', 'c'}),
            ('dimension_1', {'d', 'e', 'f'}),
        ]
        column_0 = Column(float)
        column_1 = Column(bool)
This dataset might represent a table with the following columns:
    sid :: int64
    asof_date :: datetime64[ns]
    timestamp :: datetime64[ns]
    dimension_0 :: str
    dimension_1 :: str
    column_0 :: float64
    column_1 :: bool
Here we see the implicit sid, asof_date and timestamp columns as well as the extra dimensions columns.
This DataSetFamily can be converted to a regular DataSet with:
    DataSetSlice = SomeDataSet.slice(dimension_0='a', dimension_1='e')
This sliced dataset represents the rows from the higher dimensional dataset where (dimension_0 == 'a') & (dimension_1 == 'e').
-
classmethod
slice
(*args, **kwargs)[source]¶ Take a slice of a DataSetFamily to produce a dataset indexed by asset and date.
- Parameters
*args –
**kwargs – The coordinates to fix along each extra dimension.
- Returns
dataset – A regular pipeline dataset indexed by asset and date.
- Return type
Notes
The extra dimensions coords used to produce the result are available under the extra_coords attribute.
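Conceptually, slicing a family amounts to filtering the underlying table on every extra dimension and then dropping those dimension columns. The sketch below shows that idea on a list of dictionaries. The data, the `EXTRA_DIMS` mapping and the `slice_rows` helper are all hypothetical and only mirror the SomeDataSet example above.

```python
EXTRA_DIMS = {
    "dimension_0": {"a", "b", "c"},
    "dimension_1": {"d", "e", "f"},
}

rows = [
    {"sid": 1, "dimension_0": "a", "dimension_1": "e", "column_0": 1.5},
    {"sid": 1, "dimension_0": "b", "dimension_1": "e", "column_0": 2.5},
    {"sid": 2, "dimension_0": "a", "dimension_1": "e", "column_0": 3.5},
    {"sid": 2, "dimension_0": "a", "dimension_1": "d", "column_0": 4.5},
]


def slice_rows(rows, **coords):
    """Keep rows matching every fixed coordinate and drop the extra dimensions."""
    if set(coords) != set(EXTRA_DIMS):
        raise ValueError("a value must be chosen for every extra dimension")
    for name, value in coords.items():
        if value not in EXTRA_DIMS[name]:
            raise ValueError(f"{value!r} is not a valid value for {name}")
    return [
        {k: v for k, v in row.items() if k not in EXTRA_DIMS}
        for row in rows
        if all(row[name] == value for name, value in coords.items())
    ]


sliced = slice_rows(rows, dimension_0="a", dimension_1="e")
print(sliced)
# [{'sid': 1, 'column_0': 1.5}, {'sid': 2, 'column_0': 3.5}]
```

The result has one value per asset, which is what makes it usable as an ordinary asset-and-date indexed dataset.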
-
class
zipline.pipeline.data.
EquityPricing
[source]¶ DataSet containing daily trading prices and volumes.
-
close
= EquityPricing.close::float64¶
-
high
= EquityPricing.high::float64¶
-
low
= EquityPricing.low::float64¶
-
open
= EquityPricing.open::float64¶
-
volume
= EquityPricing.volume::float64¶
-
Built-in Factors¶
-
class
zipline.pipeline.factors.
AverageDollarVolume
(inputs=sentinel('NotSpecified'), outputs=sentinel('NotSpecified'), window_length=sentinel('NotSpecified'), mask=sentinel('NotSpecified'), dtype=sentinel('NotSpecified'), missing_value=sentinel('NotSpecified'), ndim=sentinel('NotSpecified'), **kwargs)[source]¶ Average Daily Dollar Volume
Default Inputs: [EquityPricing.close, EquityPricing.volume]
Default Window Length: None
-
class
zipline.pipeline.factors.
BollingerBands
(inputs=sentinel('NotSpecified'), outputs=sentinel('NotSpecified'), window_length=sentinel('NotSpecified'), mask=sentinel('NotSpecified'), dtype=sentinel('NotSpecified'), missing_value=sentinel('NotSpecified'), ndim=sentinel('NotSpecified'), **kwargs)[source]¶ Bollinger Bands technical indicator. https://en.wikipedia.org/wiki/Bollinger_Bands
Default Inputs: zipline.pipeline.data.EquityPricing.close
- Parameters
inputs (length-1 iterable[BoundColumn]) – The expression over which to compute Bollinger Bands.
window_length (int > 0) – Length of the lookback window over which to compute the Bollinger Bands.
k (float) – The number of standard deviations to add or subtract to create the upper and lower bands.
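For orientation, the three bands can be computed for one asset in plain Python as below. The function name is hypothetical, and using the population standard deviation (rather than the sample standard deviation) is an assumption of this sketch.

```python
import statistics


def bollinger_bands(closes, window_length, k):
    """Return (lower, middle, upper) from the trailing window of closes."""
    window = closes[-window_length:]
    middle = statistics.fmean(window)
    # Assumption: population standard deviation of the window.
    spread = k * statistics.pstdev(window)
    return middle - spread, middle, middle + spread


closes = [10.0, 12.0, 14.0, 12.0, 10.0, 14.0]
lower, middle, upper = bollinger_bands(closes, window_length=4, k=2)
print(lower, middle, upper)
```

The middle band is the moving average of the window, and the outer bands sit k standard deviations below and above it.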
-
class
zipline.pipeline.factors.
BusinessDaysSincePreviousEvent
(inputs=sentinel('NotSpecified'), outputs=sentinel('NotSpecified'), window_length=sentinel('NotSpecified'), mask=sentinel('NotSpecified'), domain=sentinel('NotSpecified'), *args, **kwargs)[source]¶ Abstract class for business days since a previous event. Returns the number of business days (not trading days!) since the most recent event date for each asset.
This doesn’t use trading days for symmetry with BusinessDaysUntilNextEarnings.
Assets which announced or will announce the event today will produce a value of 0.0. Assets that announced the event on the previous business day will produce a value of 1.0.
Assets for which the event date is NaT will produce a value of NaN.
Example
BusinessDaysSincePreviousEvent can be used to create an event-driven factor. For instance, you may want to only trade assets that have a data point with an asof_date in the last 5 business days. To do this, you can create a BusinessDaysSincePreviousEvent factor, supplying the relevant asof_date column from your dataset as input, like this:
    # Factor computing number of days since most recent asof_date
    # per asset.
    days_since_event = BusinessDaysSincePreviousEvent(
        inputs=[MyDataset.asof_date]
    )
    # Filter returning True for each asset whose most recent asof_date
    # was in the last 5 business days.
    recency_filter = (days_since_event <= 5)
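The counting rule described above (0 for an event today, 1 for an event on the previous business day, weekends skipped, holidays not considered) can be sketched with the standard library. The helper `business_days_between` and the sample dates are hypothetical.

```python
from datetime import date, timedelta


def business_days_between(start, end):
    """Count weekdays in the half-open interval [start, end)."""
    days = 0
    current = start
    while current < end:
        if current.weekday() < 5:  # Monday is 0, Friday is 4
            days += 1
        current += timedelta(days=1)
    return days


today = date(2017, 3, 13)  # a Monday
last_event = {
    "AAPL": date(2017, 3, 10),  # previous Friday
    "MSFT": date(2017, 3, 13),  # today
    "MCD": date(2017, 2, 27),   # two weeks earlier
}
days_since = {a: business_days_between(d, today) for a, d in last_event.items()}
recent = {a: n <= 5 for a, n in days_since.items()}
print(days_since)  # {'AAPL': 1, 'MSFT': 0, 'MCD': 10}
print(recent)      # {'AAPL': True, 'MSFT': True, 'MCD': False}
```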
-
class
zipline.pipeline.factors.
BusinessDaysUntilNextEvent
(inputs=sentinel('NotSpecified'), outputs=sentinel('NotSpecified'), window_length=sentinel('NotSpecified'), mask=sentinel('NotSpecified'), domain=sentinel('NotSpecified'), *args, **kwargs)[source]¶ Abstract class for business days until a next event. Returns the number of business days (not trading days!) until the next known event date for each asset.
This doesn’t use trading days because the trading calendar includes information that may not have been available to the algorithm at the time when compute is called.
For example, the NYSE closings on September 11th, 2001 would not have been known to the algorithm on September 10th.
Assets that announced or will announce the event today will produce a value of 0.0. Assets that will announce the event on the next upcoming business day will produce a value of 1.0.
Assets for which the event date is NaT will produce a value of NaN.
-
class
zipline.pipeline.factors.
DailyReturns
(inputs=sentinel('NotSpecified'), outputs=sentinel('NotSpecified'), window_length=sentinel('NotSpecified'), mask=sentinel('NotSpecified'), dtype=sentinel('NotSpecified'), missing_value=sentinel('NotSpecified'), ndim=sentinel('NotSpecified'), **kwargs)[source]¶ Calculates daily percent change in close price.
Default Inputs: [EquityPricing.close]
-
class
zipline.pipeline.factors.
ExponentialWeightedMovingAverage
(inputs=sentinel('NotSpecified'), outputs=sentinel('NotSpecified'), window_length=sentinel('NotSpecified'), mask=sentinel('NotSpecified'), dtype=sentinel('NotSpecified'), missing_value=sentinel('NotSpecified'), ndim=sentinel('NotSpecified'), **kwargs)[source]¶ Exponentially Weighted Moving Average
Default Inputs: None
Default Window Length: None
- Parameters
inputs (length-1 list/tuple of BoundColumn) – The expression over which to compute the average.
window_length (int > 0) – Length of the lookback window over which to compute the average.
decay_rate (float, 0 < decay_rate <= 1) –
Weighting factor by which to discount past observations.
When calculating historical averages, rows are multiplied by the sequence:
decay_rate, decay_rate ** 2, decay_rate ** 3, ...
Notes
This class can also be imported under the name EWMA.
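The weighting sequence above can be illustrated with a small plain-Python sketch. Two points are assumptions of the sketch rather than documented behaviour: the most recent row receives the largest weight, and the weighted sum is normalised by the sum of the weights. The function name `ewma` is used here only as a local helper.

```python
def ewma(window, decay_rate):
    """Exponentially weighted average of `window`, ordered oldest to newest."""
    n = len(window)
    # The newest row gets decay_rate, the one before it decay_rate ** 2,
    # and so on back to decay_rate ** n for the oldest row.
    weights = [decay_rate ** (n - i) for i in range(n)]
    return sum(w * v for w, v in zip(weights, window)) / sum(weights)


print(ewma([1.0, 2.0, 3.0], 0.5))  # weights 0.125, 0.25, 0.5
print(ewma([1.0, 2.0, 3.0], 1.0))  # equal weights, so the plain mean
```

With a decay rate of 1 every row has the same weight, and smaller decay rates shift the average toward recent observations.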
See also
-
class
zipline.pipeline.factors.
ExponentialWeightedMovingStdDev
(inputs=sentinel('NotSpecified'), outputs=sentinel('NotSpecified'), window_length=sentinel('NotSpecified'), mask=sentinel('NotSpecified'), dtype=sentinel('NotSpecified'), missing_value=sentinel('NotSpecified'), ndim=sentinel('NotSpecified'), **kwargs)[source]¶ Exponentially Weighted Moving Standard Deviation
Default Inputs: None
Default Window Length: None
- Parameters
inputs (length-1 list/tuple of BoundColumn) – The expression over which to compute the standard deviation.
window_length (int > 0) – Length of the lookback window over which to compute the standard deviation.
decay_rate (float, 0 < decay_rate <= 1) –
Weighting factor by which to discount past observations.
When calculating historical averages, rows are multiplied by the sequence:
decay_rate, decay_rate ** 2, decay_rate ** 3, ...
Notes
This class can also be imported under the name EWMSTD.
See also
pandas.DataFrame.ewm()
-
class
zipline.pipeline.factors.
Latest
(inputs=sentinel('NotSpecified'), outputs=sentinel('NotSpecified'), window_length=sentinel('NotSpecified'), mask=sentinel('NotSpecified'), dtype=sentinel('NotSpecified'), missing_value=sentinel('NotSpecified'), ndim=sentinel('NotSpecified'), **kwargs)[source]¶ Factor producing the most recently-known value of inputs[0] on each day.
The .latest attribute of DataSet columns returns an instance of this Factor.
-
zipline.pipeline.factors.
MACDSignal
¶ alias of zipline.pipeline.factors.technical.MovingAverageConvergenceDivergenceSignal
-
class
zipline.pipeline.factors.
MaxDrawdown
(inputs=sentinel('NotSpecified'), outputs=sentinel('NotSpecified'), window_length=sentinel('NotSpecified'), mask=sentinel('NotSpecified'), dtype=sentinel('NotSpecified'), missing_value=sentinel('NotSpecified'), ndim=sentinel('NotSpecified'), **kwargs)[source]¶ Max Drawdown
Default Inputs: None
Default Window Length: None
-
class
zipline.pipeline.factors.
Returns
(inputs=sentinel('NotSpecified'), outputs=sentinel('NotSpecified'), window_length=sentinel('NotSpecified'), mask=sentinel('NotSpecified'), dtype=sentinel('NotSpecified'), missing_value=sentinel('NotSpecified'), ndim=sentinel('NotSpecified'), **kwargs)[source]¶ Calculates the percent change in close price over the given window_length.
Default Inputs: [EquityPricing.close]
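As a quick worked example, the percent change over a window can be computed for one asset as follows. The helper `window_return` is hypothetical; it assumes the window's first and last closes are the two prices being compared, so that a window_length of 2 gives the day-over-day change.

```python
def window_return(closes, window_length):
    """Percent change from the first to the last close of the trailing window."""
    window = closes[-window_length:]
    return (window[-1] - window[0]) / window[0]


closes = [100.0, 102.0, 101.0, 110.0]
print(window_return(closes, 2))  # change from 101.0 to 110.0
print(window_return(closes, 4))  # change from 100.0 to 110.0
```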
-
class
zipline.pipeline.factors.
RollingPearson
(base_factor, target, correlation_length, mask=sentinel('NotSpecified'))[source]¶ A Factor that computes pearson correlation coefficients between the columns of a given Factor and either the columns of another Factor/BoundColumn or a slice/single column of data.
- Parameters
base_factor (zipline.pipeline.Factor) – The factor for which to compute correlations of each of its columns with target.
target (zipline.pipeline.Term with a numeric dtype) – The term with which to compute correlations against each column of data produced by base_factor. This term may be a Factor, a BoundColumn or a Slice. If target is two-dimensional, correlations are computed asset-wise.
correlation_length (int) – Length of the lookback window over which to compute each correlation coefficient.
mask (zipline.pipeline.Filter, optional) – A Filter describing which assets (columns) of base_factor should have their correlation with target computed each day.
See also
scipy.stats.pearsonr(), Factor.pearsonr(), zipline.pipeline.factors.RollingPearsonOfReturns
Notes
Most users should call Factor.pearsonr rather than directly construct an instance of this class.
-
class
zipline.pipeline.factors.
RollingSpearman
(base_factor, target, correlation_length, mask=sentinel('NotSpecified'))[source]¶ A Factor that computes Spearman rank correlation coefficients between the columns of a given Factor and either the columns of another Factor/BoundColumn or a slice/single column of data.
- Parameters
base_factor (zipline.pipeline.Factor) – The factor for which to compute correlations of each of its columns with target.
target (zipline.pipeline.Term with a numeric dtype) – The term with which to compute correlations against each column of data produced by base_factor. This term may be a Factor, a BoundColumn or a Slice. If target is two-dimensional, correlations are computed asset-wise.
correlation_length (int) – Length of the lookback window over which to compute each correlation coefficient.
mask (zipline.pipeline.Filter, optional) – A Filter describing which assets (columns) of base_factor should have their correlation with target computed each day.
See also
scipy.stats.spearmanr(), Factor.spearmanr(), zipline.pipeline.factors.RollingSpearmanOfReturns
Notes
Most users should call Factor.spearmanr rather than directly construct an instance of this class.
-
class
zipline.pipeline.factors.
RollingLinearRegressionOfReturns
(target, returns_length, regression_length, mask=sentinel('NotSpecified'))[source]¶ Perform an ordinary least-squares regression predicting the returns of all other assets on the given asset.
- Parameters
target (zipline.assets.Asset) – The asset to regress against all other assets.
returns_length (int >= 2) – Length of the lookback window over which to compute returns. Daily returns require a window length of 2.
regression_length (int >= 1) – Length of the lookback window over which to compute each regression.
mask (zipline.pipeline.Filter, optional) – A Filter describing which assets should be regressed against the target asset each day.
Notes
Computing this factor over many assets can be time consuming. It is recommended that a mask be used in order to limit the number of assets over which regressions are computed.
This factor is designed to return five outputs:
alpha, a factor that computes the intercepts of each regression.
beta, a factor that computes the slopes of each regression.
r_value, a factor that computes the correlation coefficient of each regression.
p_value, a factor that computes, for each regression, the two-sided p-value for a hypothesis test whose null hypothesis is that the slope is zero.
stderr, a factor that computes the standard error of the estimate of each regression.
For more help on factors with multiple outputs, see
zipline.pipeline.CustomFactor
.
Examples
Let the following be example 10-day returns for three different assets:
                SPY    MSFT     FB
    2017-03-13  -.03    .03    .04
    2017-03-14  -.02   -.03    .02
    2017-03-15  -.01    .02    .01
    2017-03-16     0   -.02    .01
    2017-03-17   .01    .04   -.01
    2017-03-20   .02   -.03   -.02
    2017-03-21   .03    .01   -.02
    2017-03-22   .04   -.02   -.02
Suppose we are interested in predicting each stock’s returns from SPY’s over rolling 5-day look back windows. We can compute rolling regression coefficients (alpha and beta) from 2017-03-17 to 2017-03-22 by doing:
    regression_factor = RollingLinearRegressionOfReturns(
        target=sid(8554),
        returns_length=10,
        regression_length=5,
    )
    alpha = regression_factor.alpha
    beta = regression_factor.beta
The result of computing alpha from 2017-03-17 to 2017-03-22 gives:
                SPY    MSFT     FB
    2017-03-17     0    .011   .003
    2017-03-20     0   -.004   .004
    2017-03-21     0    .007   .006
    2017-03-22     0    .002   .008
And the result of computing beta from 2017-03-17 to 2017-03-22 gives:
                SPY    MSFT     FB
    2017-03-17     1     .3   -1.1
    2017-03-20     1     .2     -1
    2017-03-21     1    -.3     -1
    2017-03-22     1    -.3    -.9
Note that SPY’s column for alpha is all 0’s and for beta is all 1’s, as the regression line of SPY with itself is simply the function y = x.
To understand how each of the other values was calculated, take for example MSFT’s alpha and beta values on 2017-03-17 (.011 and .3, respectively). These values are the result of running a linear regression predicting MSFT’s returns from SPY’s returns, using values starting at 2017-03-17 and looking back 5 days. That is, the regression was run with x = [-.03, -.02, -.01, 0, .01] and y = [.03, -.03, .02, -.02, .04], and it produced a slope of .3 and an intercept of .011.
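The MSFT figures for 2017-03-17 can be checked by hand. The following is a minimal pure-Python sketch of an ordinary least-squares fit on the five (x, y) pairs quoted above; the helper name ols is illustrative and is not part of zipline.

```python
def ols(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    sxx = sum((a - x_mean) ** 2 for a in x)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


spy = [-.03, -.02, -.01, 0, .01]    # x: SPY returns, 2017-03-13 to 2017-03-17
msft = [.03, -.03, .02, -.02, .04]  # y: MSFT returns over the same days
beta, alpha = ols(spy, msft)
```

Up to floating-point rounding, beta comes out at .3 and alpha at .011, matching the values in the tables.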
-
class
zipline.pipeline.factors.
RollingPearsonOfReturns
(target, returns_length, correlation_length, mask=sentinel('NotSpecified'))[source]¶ Calculates the Pearson product-moment correlation coefficient of the returns of the given asset with the returns of all other assets.
Pearson correlation is what most people mean when they say “correlation coefficient” or “R-value”.
- Parameters
target (zipline.assets.Asset) – The asset to correlate with all other assets.
returns_length (int >= 2) – Length of the lookback window over which to compute returns. Daily returns require a window length of 2.
correlation_length (int >= 1) – Length of the lookback window over which to compute each correlation coefficient.
mask (zipline.pipeline.Filter, optional) – A Filter describing which assets should have their correlation with the target asset computed each day.
Notes
Computing this factor over many assets can be time consuming. It is recommended that a mask be used in order to limit the number of assets over which correlations are computed.
Examples
Let the following be example 10-day returns for three different assets:
                SPY    MSFT     FB
    2017-03-13  -.03    .03    .04
    2017-03-14  -.02   -.03    .02
    2017-03-15  -.01    .02    .01
    2017-03-16     0   -.02    .01
    2017-03-17   .01    .04   -.01
    2017-03-20   .02   -.03   -.02
    2017-03-21   .03    .01   -.02
    2017-03-22   .04   -.02   -.02
Suppose we are interested in SPY’s rolling returns correlation with each stock from 2017-03-17 to 2017-03-22, using a 5-day look back window (that is, we calculate each correlation coefficient over 5 days of data). We can achieve this by doing:
    rolling_correlations = RollingPearsonOfReturns(
        target=sid(8554),
        returns_length=10,
        correlation_length=5,
    )
The result of computing rolling_correlations from 2017-03-17 to 2017-03-22 gives:
                SPY    MSFT     FB
    2017-03-17     1    .15   -.96
    2017-03-20     1    .10   -.96
    2017-03-21     1   -.16   -.94
    2017-03-22     1   -.16   -.85
Note that the column for SPY is all 1’s, as the correlation of any data series with itself is always 1. To understand how each of the other values were calculated, take for example the .15 in MSFT’s column. This is the correlation coefficient between SPY’s returns looking back from 2017-03-17 (-.03, -.02, -.01, 0, .01) and MSFT’s returns (.03, -.03, .02, -.02, .04).
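The .15 in MSFT’s column can be reproduced with a short pure-Python sketch of the Pearson coefficient. The helper name pearson is illustrative and is not part of zipline.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    sxx = sum((a - x_mean) ** 2 for a in x)
    syy = sum((b - y_mean) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


spy = [-.03, -.02, -.01, 0, .01]    # SPY returns looking back from 2017-03-17
msft = [.03, -.03, .02, -.02, .04]  # MSFT returns over the same days
r = pearson(spy, msft)
```

Rounded to two decimals, r is .15. Running the same function on FB’s returns for those days gives the -.96 shown in the table.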
-
class
zipline.pipeline.factors.
RollingSpearmanOfReturns
(target, returns_length, correlation_length, mask=sentinel('NotSpecified'))[source]¶ Calculates the Spearman rank correlation coefficient of the returns of the given asset with the returns of all other assets.
- Parameters
target (zipline.assets.Asset) – The asset to correlate with all other assets.
returns_length (int >= 2) – Length of the lookback window over which to compute returns. Daily returns require a window length of 2.
correlation_length (int >= 1) – Length of the lookback window over which to compute each correlation coefficient.
mask (zipline.pipeline.Filter, optional) – A Filter describing which assets should have their correlation with the target asset computed each day.
Notes
Computing this factor over many assets can be time consuming. It is recommended that a mask be used in order to limit the number of assets over which correlations are computed.
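A Spearman coefficient is the Pearson coefficient of the ranks of the data. The sketch below applies that to the SPY and MSFT returns used in the RollingPearsonOfReturns example (the five days ending 2017-03-17). The helper names are illustrative, and the ranking helper assumes there are no tied values.

```python
import math


def ranks(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    sxx = sum((a - x_mean) ** 2 for a in x)
    syy = sum((b - y_mean) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


spy = [-.03, -.02, -.01, 0, .01]
msft = [.03, -.03, .02, -.02, .04]
rho = spearman(spy, msft)
```

For this window the ranks are [1, 2, 3, 4, 5] and [4, 1, 3, 2, 5], which gives a coefficient of 0.3.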
-
class
zipline.pipeline.factors.
SimpleBeta
(target, regression_length, allowed_missing_percentage=0.25)[source]¶ Factor producing the slope of a regression line of each asset’s daily returns against the daily returns of a single “target” asset.
- Parameters
target (zipline.Asset) – Asset against which other assets should be regressed.
regression_length (int) – Number of days of daily returns to use for the regression.
allowed_missing_percentage (float, optional) – Percentage of returns observations (between 0 and 1) that are allowed to be missing when calculating betas. Assets with more than this percentage of returns observations missing will produce values of NaN. Default behavior is that 25% of inputs can be missing.
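The effect of allowed_missing_percentage can be sketched as follows. This is an illustration under stated assumptions, not zipline’s implementation: missing observations are represented as NaN, only days where both series have a value are used, and the function name simple_beta is hypothetical.

```python
import math


def simple_beta(asset_returns, target_returns, allowed_missing_percentage=0.25):
    """Slope of asset returns regressed on target returns, or NaN if too much is missing."""
    if len(asset_returns) != len(target_returns):
        raise ValueError("return series must have the same length")
    # Keep only days where both observations are present.
    pairs = [
        (x, y)
        for x, y in zip(target_returns, asset_returns)
        if not (math.isnan(x) or math.isnan(y))
    ]
    missing = len(asset_returns) - len(pairs)
    if missing > allowed_missing_percentage * len(asset_returns) or len(pairs) < 2:
        return math.nan
    x_mean = sum(x for x, _ in pairs) / len(pairs)
    y_mean = sum(y for _, y in pairs) / len(pairs)
    covariance = sum((x - x_mean) * (y - y_mean) for x, y in pairs)
    variance = sum((x - x_mean) ** 2 for x, _ in pairs)
    return covariance / variance if variance else math.nan


target = [0.01, 0.02, -0.01, 0.02]
asset = [0.02, math.nan, -0.02, 0.04]  # one of four observations missing (25%)
beta = simple_beta(asset, target)
```

With one of four observations missing, the 25% default is not exceeded and a slope is produced; with two missing, the result is NaN.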
-
compute
(today, assets, out, all_returns, target_returns, allowed_missing_count)[source]¶ Override this method with a function that writes a value into out.
-
property
target
¶ Get the target of the beta calculation.
-
class
zipline.pipeline.factors.
RSI
(inputs=sentinel('NotSpecified'), outputs=sentinel('NotSpecified'), window_length=sentinel('NotSpecified'), mask=sentinel('NotSpecified'), dtype=sentinel('NotSpecified'), missing_value=sentinel('NotSpecified'), ndim=sentinel('NotSpecified'), **kwargs)[source]¶ Relative Strength Index
Default Inputs:
zipline.pipeline.data.EquityPricing.close
Default Window Length: 15
-
class
zipline.pipeline.factors.
SimpleMovingAverage
(inputs=sentinel('NotSpecified'), outputs=sentinel('NotSpecified'), window_length=sentinel('NotSpecified'), mask=sentinel('NotSpecified'), dtype=sentinel('NotSpecified'), missing_value=sentinel('NotSpecified'), ndim=sentinel('NotSpecified'), **kwargs)[source]¶ Average Value of an arbitrary column
Default Inputs: None
Default Window Length: None
-
class
zipline.pipeline.factors.
VWAP
(inputs=sentinel('NotSpecified'), outputs=sentinel('NotSpecified'), window_length=sentinel('NotSpecified'), mask=sentinel('NotSpecified'), dtype=sentinel('NotSpecified'), missing_value=sentinel('NotSpecified'), ndim=sentinel('NotSpecified'), **kwargs)[source]¶ Volume Weighted Average Price
Default Inputs: [EquityPricing.close, EquityPricing.volume]
Default Window Length: None
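A volume-weighted average price is the sum of price times volume divided by total volume. The sketch below shows that calculation for a single asset over one window; the function name vwap and the sample numbers are illustrative only.

```python
def vwap(closes, volumes):
    """Volume-weighted average of closes over one window."""
    if len(closes) != len(volumes):
        raise ValueError("closes and volumes must have the same length")
    total_volume = sum(volumes)
    if total_volume == 0:
        return float("nan")  # no trading in the window
    return sum(c * v for c, v in zip(closes, volumes)) / total_volume


price = vwap([10.0, 20.0], [100, 300])  # (10*100 + 20*300) / 400
```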
-
class
zipline.pipeline.factors.
WeightedAverageValue
(inputs=sentinel('NotSpecified'), outputs=sentinel('NotSpecified'), window_length=sentinel('NotSpecified'), mask=sentinel('NotSpecified'), dtype=sentinel('NotSpecified'), missing_value=sentinel('NotSpecified'), ndim=sentinel('NotSpecified'), **kwargs)[source]¶ Helper for VWAP-like computations.
Default Inputs: None
Default Window Length: None
-
class
zipline.pipeline.factors.
PercentChange
(inputs=sentinel('NotSpecified'), outputs=sentinel('NotSpecified'), window_length=sentinel('NotSpecified'), mask=sentinel('NotSpecified'), dtype=sentinel('NotSpecified'), missing_value=sentinel('NotSpecified'), ndim=sentinel('NotSpecified'), **kwargs)[source]¶ Calculates the percent change over the given window_length.
Default Inputs: None
Default Window Length: None
Notes
Percent change is calculated as
(new - old) / abs(old)
.
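The absolute value in the denominator matters when the old value is negative. A minimal sketch of the formula, with an illustrative function name:

```python
def percent_change(old, new):
    """Percent change as (new - old) / abs(old)."""
    return (new - old) / abs(old)


up = percent_change(100.0, 110.0)        # positive base, value rose
less_negative = percent_change(-10.0, -5.0)  # negative base, value rose
```

Dividing by old instead of abs(old) would report the move from -10 to -5 as negative, even though the value increased.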
-
class
zipline.pipeline.factors.
PeerCount
(inputs=sentinel('NotSpecified'), outputs=sentinel('NotSpecified'), window_length=sentinel('NotSpecified'), mask=sentinel('NotSpecified'), dtype=sentinel('NotSpecified'), missing_value=sentinel('NotSpecified'), ndim=sentinel('NotSpecified'), **kwargs)[source]¶ Peer Count of distinct categories in a given classifier. This factor is returned by the classifier instance method peer_count().
Default Inputs: None
Default Window Length: 1
Built-in Filters¶
-
class
zipline.pipeline.filters.
All
(inputs=sentinel('NotSpecified'), outputs=sentinel('NotSpecified'), window_length=sentinel('NotSpecified'), mask=sentinel('NotSpecified'), dtype=sentinel('NotSpecified'), missing_value=sentinel('NotSpecified'), ndim=sentinel('NotSpecified'), **kwargs)[source]¶ A Filter requiring that assets produce True for window_length consecutive days.
Default Inputs: None
Default Window Length: None
-
class
zipline.pipeline.filters.
AllPresent
(inputs=sentinel('NotSpecified'), outputs=sentinel('NotSpecified'), window_length=sentinel('NotSpecified'), mask=sentinel('NotSpecified'), dtype=sentinel('NotSpecified'), missing_value=sentinel('NotSpecified'), ndim=sentinel('NotSpecified'), **kwargs)[source]¶ Pipeline filter indicating input term has data for a given window.
-
class
zipline.pipeline.filters.
Any
(inputs=sentinel('NotSpecified'), outputs=sentinel('NotSpecified'), window_length=sentinel('NotSpecified'), mask=sentinel('NotSpecified'), dtype=sentinel('NotSpecified'), missing_value=sentinel('NotSpecified'), ndim=sentinel('NotSpecified'), **kwargs)[source]¶ A Filter requiring that assets produce True for at least one day in the last window_length days.
Default Inputs: None
Default Window Length: None
-
class
zipline.pipeline.filters.
AtLeastN
(inputs=sentinel('NotSpecified'), outputs=sentinel('NotSpecified'), window_length=sentinel('NotSpecified'), mask=sentinel('NotSpecified'), dtype=sentinel('NotSpecified'), missing_value=sentinel('NotSpecified'), ndim=sentinel('NotSpecified'), **kwargs)[source]¶ A Filter requiring that assets produce True for at least N days in the last window_length days.
Default Inputs: None
Default Window Length: None
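The three window filters All, Any and AtLeastN differ only in how many True values they require within the lookback window. The sketch below shows the three conditions for a single asset’s window of boolean values; the function names are illustrative and are not the zipline classes themselves.

```python
def all_true(window):
    """True only if every day in the window is True (like All)."""
    return all(window)


def any_true(window):
    """True if at least one day in the window is True (like Any)."""
    return any(window)


def at_least_n(window, n):
    """True if at least n days in the window are True (like AtLeastN)."""
    return sum(1 for value in window if value) >= n


window = [True, False, True, True]  # one asset, window_length=4
```

For this window, all_true is False, any_true is True, and at_least_n holds for n up to 3.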
-
class
zipline.pipeline.filters.
SingleAsset
(asset)[source]¶ A Filter that computes to True only for the given asset.
-
class
zipline.pipeline.filters.
StaticAssets
(assets)[source]¶ A Filter that computes True for a specific set of predetermined assets.
StaticAssets is mostly useful for debugging or for interactively computing pipeline terms for a fixed set of assets that are known ahead of time.
- Parameters
assets (iterable[Asset]) – An iterable of assets for which to filter.
-
class
zipline.pipeline.filters.
StaticSids
(sids)[source]¶ A Filter that computes True for a specific set of predetermined sids.
StaticSids is mostly useful for debugging or for interactively computing pipeline terms for a fixed set of sids that are known ahead of time.
- Parameters
sids (iterable[int]) – An iterable of sids for which to filter.
Pipeline Engine¶
-
class
zipline.pipeline.engine.
PipelineEngine
[source]¶ -
abstract
run_pipeline
(pipeline, start_date, end_date, hooks=None)[source]¶ Compute values for pipeline from start_date to end_date.
- Parameters
pipeline (zipline.pipeline.Pipeline) – The pipeline to run.
start_date (pd.Timestamp) – Start date of the computed matrix.
end_date (pd.Timestamp) – End date of the computed matrix.
hooks (list[implements(PipelineHooks)], optional) – Hooks for instrumenting Pipeline execution.
- Returns
result – A frame of computed results.
The result columns correspond to the entries of pipeline.columns, which should be a dictionary mapping strings to instances of zipline.pipeline.Term.
For each date between start_date and end_date, result will contain a row for each asset that passed pipeline.screen. A screen of None indicates that a row should be returned for each asset that existed each day.
- Return type
pd.DataFrame
-
abstract
run_chunked_pipeline
(pipeline, start_date, end_date, chunksize, hooks=None)[source]¶ Compute values for pipeline from start_date to end_date, in date chunks of size chunksize.
Chunked execution reduces memory consumption, and may reduce computation time depending on the contents of your pipeline.
- Parameters
pipeline (Pipeline) – The pipeline to run.
start_date (pd.Timestamp) – The start date to run the pipeline for.
end_date (pd.Timestamp) – The end date to run the pipeline for.
chunksize (int) – The number of days to execute at a time.
hooks (list[implements(PipelineHooks)], optional) – Hooks for instrumenting Pipeline execution.
- Returns
result – A frame of computed results.
The result columns correspond to the entries of pipeline.columns, which should be a dictionary mapping strings to instances of zipline.pipeline.Term.
For each date between start_date and end_date, result will contain a row for each asset that passed pipeline.screen. A screen of None indicates that a row should be returned for each asset that existed each day.
- Return type
pd.DataFrame
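The idea behind chunked execution can be sketched as splitting the requested dates into consecutive ranges of at most chunksize days, running the pipeline on each range, and combining the results. The helper below shows only the splitting step. It is a hypothetical illustration, not zipline’s implementation, and it works on any ordered list of dates.

```python
def chunk_ranges(dates, chunksize):
    """Split ordered dates into (start, end) pairs covering at most chunksize dates each."""
    if chunksize <= 0:
        raise ValueError("chunksize must be positive")
    return [
        (dates[i], dates[min(i + chunksize, len(dates)) - 1])
        for i in range(0, len(dates), chunksize)
    ]


sessions = ["2017-03-13", "2017-03-14", "2017-03-15", "2017-03-16", "2017-03-17"]
ranges = chunk_ranges(sessions, 2)
```

Each (start, end) pair could then be passed as start_date and end_date to a separate run, so only one chunk of intermediate data is held in memory at a time.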
-
class
zipline.pipeline.engine.
SimplePipelineEngine
(get_loader, asset_finder, default_domain=GENERIC, populate_initial_workspace=None, default_hooks=None)[source]¶ PipelineEngine class that computes each term independently.
- Parameters
get_loader (callable) – A function that is given a loadable term and returns a PipelineLoader to use to retrieve raw data for that term.
asset_finder (zipline.assets.AssetFinder) – An AssetFinder instance. We depend on the AssetFinder to determine which assets are in the top-level universe at any point in time.
populate_initial_workspace (callable, optional) – A function which will be used to populate the initial workspace when computing a pipeline. See zipline.pipeline.engine.default_populate_initial_workspace() for more info.
default_hooks (list, optional) – List of hooks that should be used to instrument all pipelines executed by this engine.
-
__init__
(get_loader, asset_finder, default_domain=GENERIC, populate_initial_workspace=None, default_hooks=None)[source]¶ Initialize self. See help(type(self)) for accurate signature.
-
run_chunked_pipeline
(pipeline, start_date, end_date, chunksize, hooks=None)[source]¶ Compute values for pipeline from start_date to end_date, in date chunks of size chunksize.
Chunked execution reduces memory consumption, and may reduce computation time depending on the contents of your pipeline.
- Parameters
pipeline (Pipeline) – The pipeline to run.
start_date (pd.Timestamp) – The start date to run the pipeline for.
end_date (pd.Timestamp) – The end date to run the pipeline for.
chunksize (int) – The number of days to execute at a time.
hooks (list[implements(PipelineHooks)], optional) – Hooks for instrumenting Pipeline execution.
- Returns
result – A frame of computed results.
The result columns correspond to the entries of pipeline.columns, which should be a dictionary mapping strings to instances of zipline.pipeline.Term.
For each date between start_date and end_date, result will contain a row for each asset that passed pipeline.screen. A screen of None indicates that a row should be returned for each asset that existed each day.
- Return type
pd.DataFrame
-
run_pipeline
(pipeline, start_date, end_date, hooks=None)[source]¶ Compute values for pipeline from start_date to end_date.
- Parameters
pipeline (zipline.pipeline.Pipeline) – The pipeline to run.
start_date (pd.Timestamp) – Start date of the computed matrix.
end_date (pd.Timestamp) – End date of the computed matrix.
hooks (list[implements(PipelineHooks)], optional) – Hooks for instrumenting Pipeline execution.
- Returns
result – A frame of computed results.
The result columns correspond to the entries of pipeline.columns, which should be a dictionary mapping strings to instances of zipline.pipeline.Term.
For each date between start_date and end_date, result will contain a row for each asset that passed pipeline.screen. A screen of None indicates that a row should be returned for each asset that existed each day.
- Return type
pd.DataFrame
-
zipline.pipeline.engine.
default_populate_initial_workspace
(initial_workspace, root_mask_term, execution_plan, dates, assets)[source]¶ The default implementation for populate_initial_workspace. This function returns the initial_workspace argument without making any modifications.
- Parameters
initial_workspace (dict[array-like]) – The initial workspace before we have populated it with any cached terms.
root_mask_term (Term) – The root mask term, normally AssetExists(). This is needed to compute the dates for individual terms.
execution_plan (ExecutionPlan) – The execution plan for the pipeline being run.
dates (pd.DatetimeIndex) – All of the dates being requested in this pipeline run including the extra dates for look back windows.
assets (pd.Int64Index) – All of the assets that exist for the window being computed.
- Returns
populated_initial_workspace – The workspace to begin computations with.
- Return type
dict[term, array-like]
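A function with this signature can be sketched in a few lines. The first function below mirrors the behaviour described above (return the workspace unchanged). The second, populate_with_cache, is a hypothetical factory showing how a custom populate_initial_workspace might add precomputed terms; its name and the plain-dict cache are assumptions made for illustration.

```python
def default_populate_initial_workspace(
    initial_workspace, root_mask_term, execution_plan, dates, assets
):
    """Return the initial workspace without making any modifications."""
    return initial_workspace


def populate_with_cache(cache):
    """Hypothetical factory: build a populate function that adds cached terms."""

    def populate(initial_workspace, root_mask_term, execution_plan, dates, assets):
        populated = dict(initial_workspace)  # copy so the input is not mutated
        populated.update(cache)
        return populated

    return populate


workspace = {"term_a": [1, 2, 3]}
unchanged = default_populate_initial_workspace(workspace, None, None, [], [])
extended = populate_with_cache({"term_b": [4, 5, 6]})(workspace, None, None, [], [])
```

A real implementation would key the workspace by pipeline terms and would need to align cached arrays with the requested dates and assets; the sketch ignores both.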
Asset Metadata¶
-
class
zipline.assets.
Asset
¶ Base class for entities that can be owned by a trading algorithm.
-
symbol
¶ Most recent ticker under which the asset traded. This field can change without warning if the asset changes tickers. Use sid if you need a persistent identifier.
- Type
-
exchange_full
¶ Full name of the exchange on which the asset trades (e.g., ‘NEW YORK STOCK EXCHANGE’).
- Type
-
exchange_info
¶ Information about the exchange this asset is listed on.
- Type
zipline.assets.ExchangeInfo
-
start_date
¶ Date on which the asset first traded.
- Type
pd.Timestamp
-
end_date
¶ Last date on which the asset traded. On Quantopian, this value is set to the current (real time) date for assets that are still trading.
- Type
pd.Timestamp
-
auto_close_date
¶ Date on which positions in this asset will be automatically liquidated to cash during a simulation. By default, this is three days after end_date.
- Type
pd.Timestamp
-
from_dict
()¶ Build an Asset instance from a dict.
-
is_alive_for_session
()¶ Returns whether the asset is alive at the given dt.
- Parameters
session_label (pd.Timestamp) – The desired session label to check. (midnight UTC)
- Returns
Whether the asset is alive at the given dt.
- Return type
bool
-
is_exchange_open
()¶ - Parameters
dt_minute (pd.Timestamp (UTC, tz-aware)) – The minute to check.
- Returns
Whether the asset’s exchange is open at the given minute.
- Return type
bool
-
-
class
zipline.assets.
Equity
¶ Asset subclass representing partial ownership of a company, trust, or partnership.
-
security_end_date
¶ DEPRECATION: This property should be deprecated and is only present for backwards compatibility.
-
security_name
¶ DEPRECATION: This property should be deprecated and is only present for backwards compatibility.
-
security_start_date
¶ DEPRECATION: This property should be deprecated and is only present for backwards compatibility.
-
Trading Calendar API¶
-
zipline.utils.calendars.
get_calendar
(name)¶ Retrieves an instance of a TradingCalendar whose name is given.
- Parameters
name (str) – The name of the TradingCalendar to be retrieved.
- Returns
calendar – The desired calendar.
- Return type
-
class
zipline.utils.calendars.
TradingCalendar
(start=Timestamp('1990-01-01 00:00:00+0000', tz='UTC'), end=Timestamp('2022-07-26 18:56:50.922955+0000', tz='UTC'))[source]¶ A TradingCalendar represents the timing information of a single market exchange.
The timing information is made up of two parts: sessions, and opens/closes.
A session represents a contiguous set of minutes, and has a label that is midnight UTC. It is important to note that a session label should not be considered a specific point in time, and that midnight UTC is just being used for convenience.
For each session, we store the open and close time in UTC time.
-
property
adhoc_holidays
¶ Returns a list of timestamps representing unplanned closes.
- Return type
list
-
property
break_end_times
¶ Returns an optional list of tuples of (start_date, break_end_time). If the break end time is constant throughout the calendar, use None for the start_date. If there is no break, return None.
-
break_start_and_end_for_session
(session_label)[source]¶ Returns a tuple of timestamps of the break start and end of the session represented by the given label.
- Parameters
session_label (pd.Timestamp) – The session whose break start and end are desired.
- Returns
The break start and end for the given session.
- Return type
(Timestamp, Timestamp)
-
property
break_start_times
¶ Returns an optional list of tuples of (start_date, break_start_time). If the break start time is constant throughout the calendar, use None for the start_date. If there is no break, return None.
-
abstract property
close_times
¶ Returns a list of tuples of (start_date, close_time). If the close time is constant throughout the calendar, use None for the start_date.
-
execution_minutes_for_session
(session_label)[source]¶ Given a session label, return the execution minutes for that session.
- Parameters
session_label (pd.Timestamp (midnight UTC)) – A session label whose session’s minutes are desired.
- Returns
All the execution minutes for the given session.
- Return type
pd.DatetimeIndex
-
is_open_on_minute
(dt, ignore_breaks=False)[source]¶ Given a dt, return whether this exchange is open at the given dt.
-
is_session
(dt)[source]¶ Given a dt, returns whether it’s a valid session label.
- Parameters
dt (pd.Timestamp) – The dt that is being tested.
- Returns
Whether the given dt is a valid session label.
- Return type
-
minute_index_to_session_labels
(index)[source]¶ Given a sorted DatetimeIndex of market minutes, return a DatetimeIndex of the corresponding session labels.
- Parameters
index (pd.DatetimeIndex or pd.Series) – The ordered list of market minutes we want session labels for.
- Returns
The list of session labels corresponding to the given minutes.
- Return type
pd.DatetimeIndex (UTC)
-
minute_to_session_label
(dt, direction='next')[source]¶ Given a minute, get the label of its containing session.
- Parameters
dt (pd.Timestamp or nanosecond offset) – The dt for which to get the containing session.
direction (str) –
“next” (default) means that if the given dt is not part of a session, return the label of the next session.
“previous” means that if the given dt is not part of a session, return the label of the previous session.
“none” means that a KeyError will be raised if the given dt is not part of a session.
- Returns
The label of the containing session.
- Return type
pd.Timestamp (midnight UTC)
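The three direction values can be illustrated on a toy calendar. The sketch below is not the zipline implementation: sessions are represented as (label, first minute, last minute) tuples with integer minutes, and all names and numbers are made up for the example.

```python
# Toy calendar: (session label, first minute, last minute), sorted by time.
SESSIONS = [("mon", 0, 389), ("tue", 1440, 1829), ("wed", 2880, 3269)]


def minute_to_session_label(minute, direction="next"):
    """Return the label of the session containing minute, or resolve gaps by direction."""
    if direction not in ("next", "previous", "none"):
        raise ValueError("invalid direction: %r" % (direction,))
    previous_label = None
    for label, first, last in SESSIONS:
        if first <= minute <= last:
            return label  # the minute belongs to this session
        if minute < first:
            break  # the minute falls in the gap before this session
        previous_label = label
    else:
        label = None  # the minute is after the last session
    if direction == "none":
        raise KeyError(minute)
    result = label if direction == "next" else previous_label
    if result is None:
        raise ValueError("no %s session for minute %r" % (direction, minute))
    return result
```

For a minute that falls between Monday’s close and Tuesday’s open, “next” resolves to "tue", “previous” resolves to "mon", and “none” raises KeyError.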
-
minutes_count_for_sessions_in_range
(start_session, end_session)[source]¶ - Parameters
start_session (pd.Timestamp) – The first session.
end_session (pd.Timestamp) – The last session.
- Returns
The total number of minutes for the contiguous chunk of sessions between start_session and end_session, inclusive.
- Return type
int
-
minutes_for_session
(session_label)[source]¶ Given a session label, return the minutes for that session.
- Parameters
session_label (pd.Timestamp (midnight UTC)) – A session label whose session’s minutes are desired.
- Returns
All the minutes for the given session.
- Return type
pd.DatetimeIndex
-
minutes_for_sessions_in_range
(start_session_label, end_session_label)[source]¶ Returns all the minutes for all the sessions from the given start session label to the given end session label, inclusive.
- Parameters
start_session_label (pd.Timestamp) – The label of the first session in the range.
end_session_label (pd.Timestamp) – The label of the last session in the range.
- Returns
The minutes in the desired range.
- Return type
pd.DatetimeIndex
-
minutes_in_range
(start_minute, end_minute)[source]¶ Given start and end minutes, return all the calendar minutes in that range, inclusive.
Given minutes don’t need to be calendar minutes.
- Parameters
start_minute (pd.Timestamp) – The minute representing the start of the desired range.
end_minute (pd.Timestamp) – The minute representing the end of the desired range.
- Returns
The minutes in the desired range.
- Return type
pd.DatetimeIndex
-
next_close
(dt)[source]¶ Given a dt, returns the next close.
- Parameters
dt (pd.Timestamp) – The dt for which to get the next close.
- Returns
The UTC timestamp of the next close.
- Return type
pd.Timestamp
-
next_minute
(dt)[source]¶ Given a dt, return the next exchange minute. If the given dt is not an exchange minute, returns the next exchange open.
- Parameters
dt (pd.Timestamp) – The dt for which to get the next exchange minute.
- Returns
The next exchange minute.
- Return type
pd.Timestamp
-
next_open
(dt)[source]¶ Given a dt, returns the next open.
If the given dt happens to be a session open, the next session’s open will be returned.
- Parameters
dt (pd.Timestamp) – The dt for which to get the next open.
- Returns
The UTC timestamp of the next open.
- Return type
pd.Timestamp
-
next_session_label
(session_label)[source]¶ Given a session label, returns the label of the next session.
- Parameters
session_label (pd.Timestamp) – A session whose next session is desired.
- Returns
The next session label (midnight UTC).
- Return type
pd.Timestamp
Notes
Raises ValueError if the given session is the last session in this calendar.
-
open_and_close_for_session
(session_label)[source]¶ Returns a tuple of timestamps of the open and close of the session represented by the given label.
- Parameters
session_label (pd.Timestamp) – The session whose open and close are desired.
- Returns
The open and close for the given session.
- Return type
(Timestamp, Timestamp)
-
abstract property
open_times
¶ Returns a list of tuples of (start_date, open_time). If the open time is constant throughout the calendar, use None for the start_date.
-
previous_close
(dt)[source]¶ Given a dt, returns the previous close.
- Parameters
dt (pd.Timestamp) – The dt for which to get the previous close.
- Returns
The UTC timestamp of the previous close.
- Return type
pd.Timestamp
-
previous_minute
(dt)[source]¶ Given a dt, return the previous exchange minute.
Raises KeyError if the given timestamp is not an exchange minute.
- Parameters
dt (pd.Timestamp) – The dt for which to get the previous exchange minute.
- Returns
The previous exchange minute.
- Return type
pd.Timestamp
-
previous_open
(dt)[source]¶ Given a dt, returns the previous open.
- Parameters
dt (pd.Timestamp) – The dt for which to get the previous open.
- Returns
The UTC timestamp of the previous open.
- Return type
pd.Timestamp
-
previous_session_label
(session_label)[source]¶ Given a session label, returns the label of the previous session.
- Parameters
session_label (pd.Timestamp) – A session whose previous session is desired.
- Returns
The previous session label (midnight UTC).
- Return type
pd.Timestamp
Notes
Raises ValueError if the given session is the first session in this calendar.
-
property
regular_holidays
¶ Returns a calendar containing the regular holidays for this calendar.
- Return type
pd.AbstractHolidayCalendar
-
session_distance
(start_session_label, end_session_label)[source]¶ Given a start and end session label, returns the distance between them. For example, for three consecutive sessions Mon., Tues., and Wed., session_distance(Mon, Wed) returns 3. If start_session is after end_session, the value will be negated.
- Parameters
start_session_label (pd.Timestamp) – The label of the start session.
end_session_label (pd.Timestamp) – The label of the ending session inclusive.
- Returns
The distance between the two sessions.
- Return type
-
sessions_in_range
(start_session_label, end_session_label)[source]¶ Given start and end session labels, return all the sessions in that range, inclusive.
- Parameters
start_session_label (pd.Timestamp (midnight UTC)) – The label representing the first session of the desired range.
end_session_label (pd.Timestamp (midnight UTC)) – The label representing the last session of the desired range.
- Returns
The desired sessions.
- Return type
pd.DatetimeIndex
-
sessions_window
(session_label, count)[source]¶ Given a session label and a window size, returns a list of sessions of size count + 1, that either starts with the given session (if count is positive) or ends with the given session (if count is negative).
- Parameters
session_label (pd.Timestamp) – The label of the initial session.
count (int) – Defines the length and the direction of the window.
- Returns
The desired sessions.
- Return type
pd.DatetimeIndex
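The counting conventions of session_distance and sessions_window (both inclusive of the endpoints) can be illustrated on a toy list of session labels. This is a sketch under the assumption that sessions are simply consecutive entries in a list; it is not the zipline implementation.

```python
# Toy calendar: five consecutive sessions.
SESSIONS = ["mon", "tue", "wed", "thu", "fri"]


def session_distance(start_label, end_label):
    """Inclusive count of sessions from start to end, negated if start is after end."""
    start_idx = SESSIONS.index(start_label)
    end_idx = SESSIONS.index(end_label)
    if end_idx < start_idx:
        return -(start_idx - end_idx + 1)
    return end_idx - start_idx + 1


def sessions_window(session_label, count):
    """Return abs(count) + 1 sessions starting (count > 0) or ending (count < 0) at the label."""
    start_idx = SESSIONS.index(session_label)
    end_idx = start_idx + count
    if not 0 <= end_idx < len(SESSIONS):
        raise ValueError("window extends beyond the calendar")
    low, high = min(start_idx, end_idx), max(start_idx, end_idx)
    return SESSIONS[low:high + 1]
```

Here session_distance("mon", "wed") is 3 and session_distance("wed", "mon") is -3, while sessions_window("tue", 2) and sessions_window("thu", -2) both return the three sessions from "tue" to "thu".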
-
property
special_closes
¶ A list of special close times and corresponding HolidayCalendars.
- Returns
List of (time, AbstractHolidayCalendar) tuples
- Return type
list
-
property
special_closes_adhoc
¶ Special closes that cannot be codified into rules.
- Returns
list
- Return type
List of (time, DatetimeIndex) tuples
-
property
special_opens
¶ A list of special open times and corresponding HolidayCalendars.
- Returns
list
- Return type
List of (time, AbstractHolidayCalendar) tuples
-
property
special_opens_adhoc
¶ Special opens that cannot be codified into rules.
- Returns
list
- Return type
List of (time, DatetimeIndex) tuples
-
property
weekmask
¶ String indicating the days of the week on which the market is open.
Default is ‘1111100’ (i.e., Monday-Friday).
-
zipline.utils.calendars.
register_calendar
(name, calendar, force=False)¶ Registers a calendar for retrieval by the get_calendar method.
- Parameters
name (str) – The key with which to register this calendar.
calendar (TradingCalendar) – The calendar to be registered for retrieval.
force (bool, optional) – If True, old calendars will be overwritten on a name collision. If False, name collisions will raise an exception. Default is False.
- Raises
CalendarNameCollision – If a calendar is already registered with the given calendar’s name.
-
zipline.utils.calendars.
register_calendar_type
(name, calendar_type, force=False)¶ Registers a calendar by type.
This is useful for registering a new calendar to be lazily instantiated at some future point in time.
- Parameters
- Raises
CalendarNameCollision – If a calendar is already registered with the given calendar’s name.
-
zipline.utils.calendars.
deregister_calendar
(name)¶ If a calendar is registered with the given name, it is de-registered.
- Parameters
name (str) – The name of the calendar to be deregistered.
-
zipline.utils.calendars.
clear_calendars
()¶ Deregisters all currently registered calendars.
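The registration functions above behave like a small name-to-calendar registry. The following is a hedged sketch of that behavior, including the force flag. CalendarRegistry is a hypothetical class written for this example, and the exception is defined locally rather than imported from zipline.

```python
class CalendarNameCollision(Exception):
    """Local stand-in for the exception raised on a name collision."""


class CalendarRegistry:
    """Hypothetical registry mirroring the documented register/deregister semantics."""

    def __init__(self):
        self._calendars = {}

    def register_calendar(self, name, calendar, force=False):
        if force:
            # Overwrite silently when force=True.
            self.deregister_calendar(name)
        if name in self._calendars:
            raise CalendarNameCollision(name)
        self._calendars[name] = calendar

    def deregister_calendar(self, name):
        # A no-op if nothing is registered under this name.
        self._calendars.pop(name, None)

    def clear_calendars(self):
        self._calendars.clear()

    def get_calendar(self, name):
        return self._calendars[name]


registry = CalendarRegistry()
registry.register_calendar("XTST", "first calendar")
registry.register_calendar("XTST", "second calendar", force=True)
print(registry.get_calendar("XTST"))
```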
Data API¶
Writers¶
-
class
zipline.data.minute_bars.
BcolzMinuteBarWriter
(rootdir, calendar, start_session, end_session, minutes_per_day, default_ohlc_ratio=1000, ohlc_ratios_per_sid=None, expectedlen=1474200, write_metadata=True)[source]¶ Class capable of writing minute OHLCV data to disk into bcolz format.
- Parameters
rootdir (string) – Path to the root directory into which to write the metadata and bcolz subdirectories.
calendar (trading_calendars.trading_calendar.TradingCalendar) – The trading calendar on which to base the minute bars. Used to get the market opens used as a starting point for each periodic span of minutes in the index, and the market closes that correspond with the market opens.
minutes_per_day (int) – The number of minutes per each period. Defaults to 390, the mode of minutes in NYSE trading days.
start_session (datetime) – The first trading session in the data set.
end_session (datetime) – The last trading session in the data set.
default_ohlc_ratio (int, optional) – The default ratio by which to multiply the pricing data to convert from floats to integers that fit within np.uint32. If ohlc_ratios_per_sid is None or does not contain a mapping for a given sid, this ratio is used. Default is OHLC_RATIO (1000).
ohlc_ratios_per_sid (dict, optional) – A dict mapping each sid in the output to the ratio by which to multiply the pricing data to convert it from floats to integers that fit within np.uint32.
expectedlen (int, optional) –
The expected length of the dataset, used when creating the initial bcolz ctable.
If the expectedlen is not used, the chunksize and corresponding compression ratios are not ideal.
Defaults to supporting 15 years of NYSE equity market data. See: http://bcolz.blosc.org/opt-tips.html#informing-about-the-length-of-your-carrays
write_metadata (bool, optional) – If True, writes the minute bar metadata (on init of the writer). If False, no metadata is written (existing metadata is retained). Default is True.
Notes
Writes a bcolz directory for each individual sid, all contained within a root directory which also contains metadata about the entire dataset.
Each individual asset’s data is stored as a bcolz table with a column for each pricing field: (open, high, low, close, volume)
The open, high, low, and close columns are integers which are 1000 times the quoted price, so that the data can be represented and stored as an np.uint32, supporting market prices quoted to the thousandths place.
volume is a np.uint32 with no mutation of the tens place.
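The float-to-integer conversion described above can be sketched as follows. This is an illustration under the documented default ratio of 1000. The helper names and the choice to raise on out-of-range values are assumptions, not the writer's actual code.

```python
UINT32_MAX = 2**32 - 1


def to_stored(price, ratio=1000):
    """Scale a float price to an integer and check that it fits in a uint32."""
    value = int(round(price * ratio))
    if not 0 <= value <= UINT32_MAX:
        raise ValueError("price %r is outside the uint32 range" % price)
    return value


def from_stored(value, ratio=1000):
    """Recover the float price from its stored integer form."""
    return value / ratio


stored = to_stored(101.235)
print(stored, from_stored(stored))
```

A larger per-sid ratio keeps more decimal places at the cost of a lower maximum representable price.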
The ‘index’ for each individual asset is a repeating period of minutes of length minutes_per_day starting from each market open. The file format does not account for half-days. e.g.:
2016-01-19 14:31
2016-01-19 14:32
…
2016-01-19 20:59
2016-01-19 21:00
2016-01-20 14:31
2016-01-20 14:32
…
2016-01-20 20:59
2016-01-20 21:00
All assets are written with a common ‘index’, sharing a common first trading day. Assets that do not begin trading until after the first trading day will have zeros for all pricing data up until data is traded.
‘index’ is in quotations, because bcolz does not provide an index. The format allows index-like behavior by writing each minute’s data into the corresponding position of the enumeration of the aforementioned datetime index.
The datetimes which correspond to each position are written in the metadata as integer nanoseconds since the epoch into the minute_index key.
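Because every session occupies exactly minutes_per_day slots, a minute's position follows from its session number and its offset from that session's first minute. The sketch below assumes two hypothetical sessions whose first minutes match the example timestamps above. The function names are illustrative only.

```python
from datetime import datetime

MINUTES_PER_DAY = 390

# Hypothetical first minute of each session (UTC).
FIRST_MINUTES = [
    datetime(2016, 1, 19, 14, 31),
    datetime(2016, 1, 20, 14, 31),
]


def minute_position(dt):
    """Return the position of minute dt in the shared 'index'."""
    for session_idx, first in enumerate(FIRST_MINUTES):
        offset = int((dt - first).total_seconds() // 60)
        if 0 <= offset < MINUTES_PER_DAY:
            return session_idx * MINUTES_PER_DAY + offset
    raise ValueError("%s is not a market minute" % dt)


def data_len_for_day(session_idx):
    """Number of slots up to and including the given session."""
    return (session_idx + 1) * MINUTES_PER_DAY


print(minute_position(datetime(2016, 1, 19, 21, 0)))   # last slot of the first session
print(minute_position(datetime(2016, 1, 20, 14, 31)))  # first slot of the second session
```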
-
data_len_for_day
(day)[source]¶ Return the number of data points up to and including the provided day.
-
last_date_in_output_for_sid
(sid)[source]¶ - Parameters
sid (int) – Asset identifier.
- Returns
out – The midnight of the last date written to the output for the given sid.
- Return type
pd.Timestamp
-
classmethod
open
(rootdir, end_session=None)[source]¶ Open an existing rootdir for writing.
- Parameters
end_session (Timestamp (optional)) – When appending, the intended new end_session.
-
pad
(sid, date)[source]¶ Fill sid container with empty data through the specified date.
If the last recorded trade is not at the close, then that day will be padded with zeros until its close. Any day after that (up to and including the specified date) will be padded with minutes_per_day worth of zeros.
- Parameters
sid (int) – The asset identifier for the data being written.
date (datetime-like) – The date used to calculate how many slots to pad. The padding is done through the date, i.e. after the padding is done the last_date_in_output_for_sid will be equal to date.
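The padding rule can be sketched on a plain list standing in for one column of a sid's table. This is an illustration only: the target session is passed as an index rather than a date, and the function is a hypothetical stand-in for the writer's method.

```python
def pad(values, through_session_idx, minutes_per_day=390):
    """Append zeros so that values covers every slot through the given session."""
    target_len = (through_session_idx + 1) * minutes_per_day
    missing = target_len - len(values)
    if missing > 0:
        # Fills the rest of a partial day, then any whole days that follow.
        values.extend([0] * missing)
    return values


column = [101235] * 100  # 100 minutes already written on the first session
pad(column, through_session_idx=1)
print(len(column))
```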
-
set_sid_attrs
(sid, **kwargs)[source]¶ Write all the supplied kwargs as attributes of the sid’s file.
-
sidpath
(sid)[source]¶ - Parameters
sid (int) – Asset identifier.
- Returns
out – Full path to the bcolz rootdir for the given sid.
- Return type
string
-
write
(data, show_progress=False, invalid_data_behavior='warn')[source]¶ Write a stream of minute data.
- Parameters
data (iterable[(int, pd.DataFrame)]) –
The data to write. Each element should be a tuple of sid, data where data has the following format:
- columns(‘open’, ‘high’, ‘low’, ‘close’, ‘volume’)
open : float64 high : float64 low : float64 close : float64 volume : float64|int64
index : DatetimeIndex of market minutes.
A given sid may appear more than once in data; however, the dates must be strictly increasing.
show_progress (bool, optional) – Whether or not to show a progress bar while writing.
-
write_cols
(sid, dts, cols, invalid_data_behavior='warn')[source]¶ Write the OHLCV data for the given sid. If there is no bcolz ctable yet created for the sid, create it. If the bcolz ctable does not extend exactly to the date before the first day provided, fill the ctable with 0s up to that date.
- Parameters
sid (int) – The asset identifier for the data being written.
dts (datetime64 array) – The dts corresponding to values in cols.
cols (dict of str -> np.array) – dict of market data with the following characteristics. keys are (‘open’, ‘high’, ‘low’, ‘close’, ‘volume’) open : float64 high : float64 low : float64 close : float64 volume : float64|int64
-
write_sid
(sid, df, invalid_data_behavior='warn')[source]¶ Write the OHLCV data for the given sid. If there is no bcolz ctable yet created for the sid, create it. If the bcolz ctable does not extend exactly to the date before the first day provided, fill the ctable with 0s up to that date.
- Parameters
sid (int) – The asset identifier for the data being written.
df (pd.DataFrame) –
DataFrame of market data with the following characteristics. columns : (‘open’, ‘high’, ‘low’, ‘close’, ‘volume’)
open : float64 high : float64 low : float64 close : float64 volume : float64|int64
index : DatetimeIndex of market minutes.
-
class
zipline.data.bcolz_daily_bars.
BcolzDailyBarWriter
(filename, calendar, start_session, end_session)[source]¶ Class capable of writing daily OHLCV data to disk in a format that can be read efficiently by BcolzDailyBarReader.
- Parameters
filename (str) – The location at which we should write our output.
calendar (zipline.utils.calendar.trading_calendar) – Calendar to use to compute asset calendar offsets.
start_session (pd.Timestamp) – Midnight UTC session label.
end_session (pd.Timestamp) – Midnight UTC session label.
-
write
(data, assets=None, show_progress=False, invalid_data_behavior='warn')[source]¶ - Parameters
data (iterable[tuple[int, pandas.DataFrame or bcolz.ctable]]) – The data chunks to write. Each chunk should be a tuple of sid and the data for that asset.
assets (set[int], optional) – The assets that should be in data. If this is provided we will check data against the assets and provide better progress information.
show_progress (bool, optional) – Whether or not to show a progress bar while writing.
invalid_data_behavior ({'warn', 'raise', 'ignore'}, optional) – What to do when data is encountered that is outside the range of a uint32.
- Returns
table – The newly-written table.
- Return type
bcolz.ctable
-
write_csvs
(asset_map, show_progress=False, invalid_data_behavior='warn')[source]¶ Read CSVs as DataFrames from our asset map.
- Parameters
asset_map (dict[int -> str]) – A mapping from asset id to file path with the CSV data for that asset
show_progress (bool) – Whether or not to show a progress bar while writing.
invalid_data_behavior ({'warn', 'raise', 'ignore'}) – What to do when data is encountered that is outside the range of a uint32.
-
class
zipline.data.adjustments.
SQLiteAdjustmentWriter
(conn_or_path, equity_daily_bar_reader, overwrite=False)[source]¶ Writer for data to be read by SQLiteAdjustmentReader
- Parameters
conn_or_path (str or sqlite3.Connection) – A handle to the target sqlite database.
equity_daily_bar_reader (SessionBarReader) – Daily bar reader to use for dividend writes.
overwrite (bool, optional, default=False) – If True and conn_or_path is a string, remove any existing files at the given path before connecting.
-
calc_dividend_ratios
(dividends)[source]¶ Calculate the ratios to apply to equities when looking back at pricing history so that the price is smoothed over the ex_date, when the market adjusts to the change in equity value due to upcoming dividend.
- Returns
A frame in the same format as splits and mergers, with keys:
- sid, the id of the equity.
- effective_date, the date in seconds on which to apply the ratio.
- ratio, the ratio to apply to backwards looking pricing data.
- Return type
DataFrame
-
write
(splits=None, mergers=None, dividends=None, stock_dividends=None)[source]¶ Writes data to a SQLite file to be read by SQLiteAdjustmentReader.
- Parameters
splits (pandas.DataFrame, optional) –
DataFrame containing split data. The format of this dataframe is:
- effective_date : int
The date, represented as seconds since Unix epoch, on which the adjustment should be applied.
- ratio : float
A value to apply to all data earlier than the effective date. For open, high, low, and close those values are multiplied by the ratio. Volume is divided by this value.
- sid : int
The asset id associated with this adjustment.
mergers (pandas.DataFrame, optional) –
DataFrame containing merger data. The format of this dataframe is:
- effective_date : int
The date, represented as seconds since Unix epoch, on which the adjustment should be applied.
- ratio : float
A value to apply to all data earlier than the effective date. For open, high, low, and close those values are multiplied by the ratio. Volume is unaffected.
- sid : int
The asset id associated with this adjustment.
dividends (pandas.DataFrame, optional) –
DataFrame containing dividend data. The format of the dataframe is:
- sid : int
The asset id associated with this adjustment.
- ex_date : datetime64
The date on which an equity must be held to be eligible to receive payment.
- declared_date : datetime64
The date on which the dividend is announced to the public.
- pay_date : datetime64
The date on which the dividend is distributed.
- record_date : datetime64
The date on which the stock ownership is checked to determine distribution of dividends.
- amount : float
The cash amount paid for each share.
Dividend ratios are calculated as:
1.0 - (dividend_value / "close on day prior to ex_date")
stock_dividends (pandas.DataFrame, optional) –
DataFrame containing stock dividend data. The format of the dataframe is:
- sid : int
The asset id associated with this adjustment.
- ex_date : datetime64
The date on which an equity must be held to be eligible to receive payment.
- declared_date : datetime64
The date on which the dividend is announced to the public.
- pay_date : datetime64
The date on which the dividend is distributed.
- record_date : datetime64
The date on which the stock ownership is checked to determine distribution of dividends.
- payment_sid : int
The asset id of the shares that should be paid instead of cash.
- ratio : float
The ratio of currently held shares in the held sid that should be paid with new shares of the payment_sid.
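How these ratios act on pricing data can be sketched with plain dictionaries. The bar layout, the apply_ratio helper, and the sample values are assumptions made for illustration; only the dividend formula and the multiply/divide rules come from the parameter descriptions above.

```python
def dividend_ratio(amount, prior_close):
    """1.0 - (dividend_value / close on day prior to ex_date)."""
    return 1.0 - amount / prior_close


def apply_ratio(bars, effective_date, ratio, adjust_volume):
    """Adjust every bar dated earlier than effective_date (seconds since epoch)."""
    adjusted = []
    for bar in bars:
        bar = dict(bar)  # leave the input untouched
        if bar["date"] < effective_date:
            bar["close"] *= ratio
            if adjust_volume:  # splits divide volume; mergers leave it alone
                bar["volume"] /= ratio
        adjusted.append(bar)
    return adjusted


bars = [
    {"date": 1453161600, "close": 100.0, "volume": 1000},
    {"date": 1453248000, "close": 50.0, "volume": 2000},
]
# A hypothetical 2-for-1 split effective on the second day (ratio 0.5).
split_adjusted = apply_ratio(bars, effective_date=1453248000, ratio=0.5, adjust_volume=True)
print(split_adjusted[0])
print(dividend_ratio(1.0, 50.0))
```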
-
class
zipline.assets.
AssetDBWriter
(engine)[source]¶ Class used to write data to an assets db.
- Parameters
engine (Engine or str) – An SQLAlchemy engine or path to a SQL database.
-
init_db
(txn=None)[source]¶ Connect to database and create tables.
- Parameters
txn (sa.engine.Connection, optional) – The transaction to execute in. If this is not provided, a new transaction will be started with the engine provided.
- Returns
metadata – The metadata that describes the new assets db.
- Return type
sa.MetaData
-
write
(equities=None, futures=None, exchanges=None, root_symbols=None, equity_supplementary_mappings=None, chunk_size=999)[source]¶ Write asset metadata to a sqlite database.
- Parameters
equities (pd.DataFrame, optional) –
The equity metadata. The columns for this dataframe are:
- symbol : str
The ticker symbol for this equity.
- asset_name : str
The full name for this asset.
- start_date : datetime
The date when this asset was created.
- end_date : datetime, optional
The last date we have trade data for this asset.
- first_traded : datetime, optional
The first date we have trade data for this asset.
- auto_close_date : datetime, optional
The date on which to close any positions in this asset.
- exchange : str
The exchange where this asset is traded.
The index of this dataframe should contain the sids.
futures (pd.DataFrame, optional) –
The future contract metadata. The columns for this dataframe are:
- symbol : str
The ticker symbol for this futures contract.
- root_symbol : str
The root symbol, or the symbol with the expiration stripped out.
- asset_name : str
The full name for this asset.
- start_date : datetime, optional
The date when this asset was created.
- end_date : datetime, optional
The last date we have trade data for this asset.
- first_traded : datetime, optional
The first date we have trade data for this asset.
- exchange : str
The exchange where this asset is traded.
- notice_date : datetime
The date when the owner of the contract may be forced to take physical delivery of the contract’s asset.
- expiration_date : datetime
The date when the contract expires.
- auto_close_date : datetime
The date when the broker will automatically close any positions in this contract.
- tick_size : float
The minimum price movement of the contract.
- multiplier : float
The amount of the underlying asset represented by this contract.
exchanges (pd.DataFrame, optional) –
The exchanges where assets can be traded. The columns of this dataframe are:
- exchange : str
The full name of the exchange.
- canonical_name : str
The canonical name of the exchange.
- country_code : str
The ISO 3166 alpha-2 country code of the exchange.
root_symbols (pd.DataFrame, optional) –
The root symbols for the futures contracts. The columns for this dataframe are:
- root_symbol : str
The root symbol name.
- root_symbol_id : int
The unique id for this root symbol.
- sector : string, optional
The sector of this root symbol.
- description : string, optional
A short description of this root symbol.
- exchange : str
The exchange where this root symbol is traded.
equity_supplementary_mappings (pd.DataFrame, optional) – Additional mappings from values of arbitrary type to assets.
chunk_size (int, optional) – The number of rows to write to the SQLite table at once. This defaults to the default number of bind params in sqlite. If you have compiled sqlite3 with more or fewer bind params you may want to pass that value here.
See also
zipline.assets.asset_finder()
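The idea behind chunk_size, writing rows in batches no larger than a fixed size, can be sketched with the standard-library sqlite3 module. The table name, columns, and write_rows helper are hypothetical and do not reflect the real assets db schema or how AssetDBWriter issues its inserts.

```python
import sqlite3


def chunks(rows, size):
    """Yield consecutive slices of rows, each at most size long."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def write_rows(conn, rows, chunk_size=999):
    """Insert rows in batches and return the number of batches written."""
    batches = 0
    for chunk in chunks(rows, chunk_size):
        conn.executemany("INSERT INTO equities (sid, symbol) VALUES (?, ?)", chunk)
        batches += 1
    conn.commit()
    return batches


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE equities (sid INTEGER PRIMARY KEY, symbol TEXT)")
rows = [(sid, "SYM%d" % sid) for sid in range(2500)]
batches = write_rows(conn, rows)
count = conn.execute("SELECT COUNT(*) FROM equities").fetchone()[0]
print(batches, count)
```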
-
write_direct
(equities=None, equity_symbol_mappings=None, equity_supplementary_mappings=None, futures=None, exchanges=None, root_symbols=None, chunk_size=999)[source]¶ Write asset metadata to a sqlite database in the format that it is stored in the assets db.
- Parameters
equities (pd.DataFrame, optional) –
The equity metadata. The columns for this dataframe are:
- symbol : str
The ticker symbol for this equity.
- asset_name : str
The full name for this asset.
- start_date : datetime
The date when this asset was created.
- end_date : datetime, optional
The last date we have trade data for this asset.
- first_traded : datetime, optional
The first date we have trade data for this asset.
- auto_close_date : datetime, optional
The date on which to close any positions in this asset.
- exchange : str
The exchange where this asset is traded.
The index of this dataframe should contain the sids.
futures (pd.DataFrame, optional) –
The future contract metadata. The columns for this dataframe are:
- symbol : str
The ticker symbol for this futures contract.
- root_symbol : str
The root symbol, or the symbol with the expiration stripped out.
- asset_name : str
The full name for this asset.
- start_date : datetime, optional
The date when this asset was created.
- end_date : datetime, optional
The last date we have trade data for this asset.
- first_traded : datetime, optional
The first date we have trade data for this asset.
- exchange : str
The exchange where this asset is traded.
- notice_date : datetime
The date when the owner of the contract may be forced to take physical delivery of the contract’s asset.
- expiration_date : datetime
The date when the contract expires.
- auto_close_date : datetime
The date when the broker will automatically close any positions in this contract.
- tick_size : float
The minimum price movement of the contract.
- multiplier : float
The amount of the underlying asset represented by this contract.
exchanges (pd.DataFrame, optional) –
The exchanges where assets can be traded. The columns of this dataframe are:
- exchange : str
The full name of the exchange.
- canonical_name : str
The canonical name of the exchange.
- country_code : str
The ISO 3166 alpha-2 country code of the exchange.
root_symbols (pd.DataFrame, optional) –
The root symbols for the futures contracts. The columns for this dataframe are:
- root_symbol : str
The root symbol name.
- root_symbol_id : int
The unique id for this root symbol.
- sector : string, optional
The sector of this root symbol.
- description : string, optional
A short description of this root symbol.
- exchange : str
The exchange where this root symbol is traded.
equity_supplementary_mappings (pd.DataFrame, optional) – Additional mappings from values of arbitrary type to assets.
chunk_size (int, optional) – The number of rows to write to the SQLite table at once. This defaults to the default number of bind params in sqlite. If you have compiled sqlite3 with more or fewer bind params you may want to pass that value here.
Readers¶
-
class
zipline.data.minute_bars.
BcolzMinuteBarReader
(rootdir, sid_cache_sizes=mappingproxy({'close': 3000, 'open': 1550, 'high': 1550, 'low': 1550, 'volume': 1550}))[source]¶ Reader for data written by BcolzMinuteBarWriter
- Parameters
rootdir (string) – The root directory containing the metadata and asset bcolz directories.
-
property
first_trading_day
¶ The first trading day (session) for which the reader can provide data.
- Return type
pd.Timestamp
-
get_last_traded_dt
(asset, dt)[source]¶ Get the latest minute on or before dt in which asset traded.
If there are no trades on or before dt, returns pd.NaT.
- Parameters
asset (zipline.asset.Asset) – The asset for which to get the last traded minute.
dt (pd.Timestamp) – The minute at which to start searching for the last traded minute.
- Returns
last_traded – The dt of the last trade for the given asset, using the input dt as a vantage point.
- Return type
pd.Timestamp
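A minimal sketch of the "latest minute on or before dt" search, over sorted minutes and their volumes. It assumes a minute counts as traded when its volume is nonzero, and it returns None where the reader would return pd.NaT. The function body is illustrative, not the reader's implementation.

```python
from bisect import bisect_right
from datetime import datetime


def get_last_traded_dt(minutes, volumes, dt):
    """Return the latest minute <= dt with nonzero volume, or None."""
    # Index of the last minute that is on or before dt.
    i = bisect_right(minutes, dt) - 1
    while i >= 0:
        if volumes[i] > 0:
            return minutes[i]
        i -= 1
    return None


minutes = [datetime(2016, 1, 19, 14, m) for m in (31, 32, 33, 34)]
volumes = [0, 500, 0, 0]
print(get_last_traded_dt(minutes, volumes, datetime(2016, 1, 19, 14, 34)))
print(get_last_traded_dt(minutes, volumes, datetime(2016, 1, 19, 14, 31)))
```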
-
get_value
(sid, dt, field)[source]¶ Retrieve the pricing info for the given sid, dt, and field.
- Parameters
sid (int) – Asset identifier.
dt (datetime-like) – The datetime at which the trade occurred.
field (string) – The type of pricing data to retrieve. (‘open’, ‘high’, ‘low’, ‘close’, ‘volume’)
- Returns
out (float|int)
The market data for the given sid, dt, and field coordinates.
For OHLC – Returns a float if a trade occurred at the given dt. If no trade occurred, a np.nan is returned.
For volume – Returns the integer value of the volume. (A volume of 0 signifies no trades for the given dt.)
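The return conventions for get_value can be sketched as a decoding step over a stored integer. This assumes a stored OHLC value of 0 marks a minute with no trade (consistent with zero-padding by the writer) and the default ratio of 1000. The decode_value helper is hypothetical.

```python
import math


def decode_value(field, stored, ratio=1000):
    """Turn a stored integer into the value returned to the caller."""
    if field == "volume":
        # Volume stays an integer; 0 signifies no trades.
        return int(stored)
    if stored == 0:
        # Assumption: an empty OHLC slot is reported as NaN.
        return math.nan
    return stored / ratio


print(decode_value("close", 101235))
print(decode_value("close", 0))
print(decode_value("volume", 0))
```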
-
load_raw_arrays
(fields, start_dt, end_dt, sids)[source]¶ - Parameters
fields (list of str) – ‘open’, ‘high’, ‘low’, ‘close’, or ‘volume’
start_dt (Timestamp) – Beginning of the window range.
end_dt (Timestamp) – End of the window range.
sids (list of int) – The asset identifiers in the window.
- Returns
A list with an entry per field of ndarrays with shape (minutes in range, sids) with a dtype of float64, containing the values for the respective field over start and end dt range.
- Return type
list of np.ndarray
-
property
trading_calendar
¶ Returns the zipline.utils.calendar.trading_calendar used to read the data. Can be None (if the writer didn’t specify it).
-
class
zipline.data.bcolz_daily_bars.
BcolzDailyBarReader
(table, read_all_threshold=3000)[source]¶ Reader for raw pricing data written by BcolzDailyBarWriter.
- Parameters
table (bcolz.ctable) – The ctable containing the pricing data, with attrs corresponding to the Attributes list below.
read_all_threshold (int) – The number of equities that determines the read strategy. Below this number, the data is read by reading a slice from the carray per asset. Above it, the data is read by pulling all of the data for all assets into memory and then indexing into that array for each day and asset pair. Used to tune performance of reads when using a small or large number of equities.
-
The table with which this loader interacts contains the following attributes.
We use first_row and last_row together to quickly find ranges of rows to load when reading an asset's data into memory.
We use calendar_offset and calendar to orient loaded blocks within a range of queried dates.
Notes
A Bcolz CTable is comprised of Columns and Attributes. The table with which this loader interacts contains the following columns:
[‘open’, ‘high’, ‘low’, ‘close’, ‘volume’, ‘day’, ‘id’].
The data in these columns is interpreted as follows:
Price columns (‘open’, ‘high’, ‘low’, ‘close’) are interpreted as 1000 * as-traded dollar value.
Volume is interpreted as as-traded volume.
Day is interpreted as seconds since midnight UTC, Jan 1, 1970.
Id is the asset id of the row.
The data in each column is grouped by asset and then sorted by day within each asset block.
The table is built to represent a long time range of data, e.g. ten years of equity data, so the asset blocks are not all the same length. The blocks are clipped to the known start and end date of each asset to cut down on the number of empty values that would need to be included to make a regular/cubic dataset.
When read across, the open, high, low, close, and volume values with the same index should represent the same asset and day.
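The grouped-by-asset layout, and the way first_row and last_row locate one asset's block, can be sketched with parallel lists standing in for the ctable columns. All values are made up for illustration, and the helper names are assumptions.

```python
# Parallel "columns": rows are grouped by asset id, then sorted by day.
ids = [1, 1, 1, 2, 2]
days = [1453161600, 1453248000, 1453334400, 1453248000, 1453334400]  # seconds since epoch
close = [10000, 10100, 10200, 5000, 5050]  # 1000 * as-traded dollar value


def row_ranges(ids):
    """Map each asset id to the first and last row of its block."""
    first_row, last_row = {}, {}
    for row, sid in enumerate(ids):
        first_row.setdefault(sid, row)
        last_row[sid] = row
    return first_row, last_row


def closes_for(sid, first_row, last_row):
    """Slice one asset's block and convert stored integers back to dollars."""
    block = close[first_row[sid]:last_row[sid] + 1]
    return [value / 1000 for value in block]


first_row, last_row = row_ranges(ids)
print(first_row, last_row)
print(closes_for(2, first_row, last_row))
```

Asset 2 starts a day later than asset 1, so its block is shorter. This is the clipping described above.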
-
currency_codes
(sids)[source]¶ Get currencies in which prices are quoted for the requested sids.
Assumes that a sid’s prices are always quoted in a single currency.
- Parameters
sids (np.array[int64]) – Array of sids for which currencies are needed.
- Returns
currency_codes – Array of currency codes for listing currencies of sids. Implementations should return None for sids whose currency is unknown.
- Return type
np.array[object]
-
get_last_traded_dt
(asset, day)[source]¶ Get the latest minute on or before dt in which asset traded.
If there are no trades on or before dt, returns pd.NaT.
- Parameters
asset (zipline.asset.Asset) – The asset for which to get the last traded minute.
dt (pd.Timestamp) – The minute at which to start searching for the last traded minute.
- Returns
last_traded – The dt of the last trade for the given asset, using the input dt as a vantage point.
- Return type
pd.Timestamp
-
get_value
(sid, dt, field)[source]¶ - Parameters
sid (int) – The asset identifier.
day (datetime64-like) – Midnight of the day for which data is requested.
colname (string) – The price field. e.g. (‘open’, ‘high’, ‘low’, ‘close’, ‘volume’)
- Returns
The spot price for colname of the given sid on the given day. Raises a NoDataOnDate exception if the given day and sid is before or after the date range of the equity. Returns -1 if the day is within the date range, but the price is 0.
- Return type
-
property
last_available_dt
¶ The last session for which the reader can provide data.
- Return type
pd.Timestamp
-
load_raw_arrays
(columns, start_date, end_date, assets)[source]¶ - Parameters
columns (list of str) – ‘open’, ‘high’, ‘low’, ‘close’, or ‘volume’
start_date (Timestamp) – Beginning of the window range.
end_date (Timestamp) – End of the window range.
assets (list of int) – The asset identifiers in the window.
- Returns
A list with an entry per field of ndarrays with shape (minutes in range, sids) with a dtype of float64, containing the values for the respective field over start and end dt range.
- Return type
list of np.ndarray
-
class
zipline.data.adjustments.
SQLiteAdjustmentReader
(conn)[source]¶ Loads adjustments based on corporate actions from a SQLite database.
Expects data written in the format output by SQLiteAdjustmentWriter.
- Parameters
conn (str or sqlite3.Connection) – Connection from which to load data.
-
load_adjustments
(dates, assets, should_include_splits, should_include_mergers, should_include_dividends, adjustment_type)[source]¶ Load collection of Adjustment objects from underlying adjustments db.
- Parameters
dates (pd.DatetimeIndex) – Dates for which adjustments are needed.
assets (pd.Int64Index) – Assets for which adjustments are needed.
should_include_splits (bool) – Whether split adjustments should be included.
should_include_mergers (bool) – Whether merger adjustments should be included.
should_include_dividends (bool) – Whether dividend adjustments should be included.
adjustment_type (str) – Whether price adjustments, volume adjustments, or both, should be included in the output.
- Returns
adjustments – A dictionary containing price and/or volume adjustment mappings from index to adjustment objects to apply at that index.
- Return type
dict[str -> dict[int -> Adjustment]]
-
unpack_db_to_component_dfs
(convert_dates=False)[source]¶ Returns the set of known tables in the adjustments file in DataFrame form.
- Parameters
convert_dates (bool, optional) – By default, dates are returned in seconds since EPOCH. If convert_dates is True, all ints in date columns will be converted to datetimes.
- Returns
dfs – Dictionary which maps table name to the corresponding DataFrame version of the table, where all date columns have been coerced back from int to datetime.
- Return type
dict{str->DataFrame}
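The effect of convert_dates can be sketched as a conversion from integer seconds since the epoch to timezone-aware datetimes. The row layout and the list of date columns are assumptions for this example. The real method returns DataFrames rather than lists of dicts.

```python
from datetime import datetime, timezone


def convert_date_columns(rows, date_columns=("effective_date",)):
    """Return copies of rows with integer date columns converted to UTC datetimes."""
    converted = []
    for row in rows:
        new_row = dict(row)
        for column in date_columns:
            new_row[column] = datetime.fromtimestamp(row[column], tz=timezone.utc)
        converted.append(new_row)
    return converted


splits = [{"sid": 1, "effective_date": 1453161600, "ratio": 0.5}]
print(convert_date_columns(splits))
```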
-
class
zipline.assets.
AssetFinder
(engine, future_chain_predicates={'AD': functools.partial(<built-in function delivery_predicate>, {'H', 'M', 'U', 'Z'}), 'BP': functools.partial(<built-in function delivery_predicate>, {'H', 'M', 'U', 'Z'}), 'CD': functools.partial(<built-in function delivery_predicate>, {'H', 'M', 'U', 'Z'}), 'EL': functools.partial(<built-in function delivery_predicate>, {'H', 'M', 'U', 'Z'}), 'GC': functools.partial(<built-in function delivery_predicate>, {'Q', 'G', 'Z', 'M', 'J', 'V'}), 'JY': functools.partial(<built-in function delivery_predicate>, {'H', 'M', 'U', 'Z'}), 'ME': functools.partial(<built-in function delivery_predicate>, {'H', 'M', 'U', 'Z'}), 'PA': functools.partial(<built-in function delivery_predicate>, {'H', 'M', 'U', 'Z'}), 'PL': functools.partial(<built-in function delivery_predicate>, {'J', 'F', 'N', 'V'}), 'SV': functools.partial(<built-in function delivery_predicate>, {'U', 'Z', 'H', 'N', 'K'}), 'XG': functools.partial(<built-in function delivery_predicate>, {'Q', 'G', 'Z', 'M', 'J', 'V'}), 'YS': functools.partial(<built-in function delivery_predicate>, {'U', 'Z', 'H', 'N', 'K'})})[source]¶ An AssetFinder is an interface to a database of Asset metadata written by an
AssetDBWriter.
This class provides methods for looking up assets by unique integer id or by symbol. For historical reasons, we refer to these unique ids as ‘sids’.
- Parameters
engine (str or SQLAlchemy.engine) – An engine with a connection to the asset database to use, or a string that can be parsed by SQLAlchemy as a URI.
future_chain_predicates (dict) – A dict mapping future root symbol to a predicate function which accepts a contract as a parameter and returns whether or not the contract should be included in the chain.
-
property
equities_sids
¶ All of the sids for equities in the asset finder.
-
property
futures_sids
¶ All of the sids for futures contracts in the asset finder.
-
get_supplementary_field
(sid, field_name, as_of_date)[source]¶ Get the value of a supplementary field for an asset.
- Parameters
sid (int) – The sid of the asset to query.
field_name (str) – Name of the supplementary field.
as_of_date (pd.Timestamp, None) – The last known value on this date is returned. If None, a value is returned only if we’ve only ever had one value for this sid. If None and we’ve had multiple values, MultipleValuesFoundForSid is raised.
- Raises
NoValueForSid – If we have no values for this asset, or no value was known on this as_of_date.
MultipleValuesFoundForSid – If we have had multiple values for this asset over time, and None was passed for as_of_date.
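The as_of_date rules for get_supplementary_field can be sketched over a per-asset history of (start_date, value) pairs. The history structure and the lookup function are hypothetical, and the two exceptions are defined locally as stand-ins.

```python
from datetime import date


class NoValueForSid(Exception):
    """Local stand-in: no value is known."""


class MultipleValuesFoundForSid(Exception):
    """Local stand-in: several values exist and no as_of_date was given."""


def lookup_supplementary(history, as_of_date):
    """history is a list of (start_date, value) pairs sorted by start_date."""
    if not history:
        raise NoValueForSid()
    if as_of_date is None:
        if len(history) > 1:
            raise MultipleValuesFoundForSid()
        return history[0][1]
    known = [value for start, value in history if start <= as_of_date]
    if not known:
        raise NoValueForSid()
    return known[-1]  # the last value known on as_of_date


history = [(date(2014, 1, 1), "OLD"), (date(2016, 1, 1), "NEW")]
print(lookup_supplementary(history, date(2015, 6, 1)))
print(lookup_supplementary(history, date(2016, 6, 1)))
```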
-
lifetimes
(dates, include_start_date, country_codes)[source]¶ Compute a DataFrame representing asset lifetimes for the specified date range.
- Parameters
dates (pd.DatetimeIndex) – The dates for which to compute lifetimes.
include_start_date (bool) –
Whether or not to count the asset as alive on its start_date.
This is useful in a backtesting context where lifetimes is being used to signify “do I have data for this asset as of the morning of this date?” For many financial metrics, (e.g. daily close), data isn’t available for an asset until the end of the asset’s first day.
country_codes (iterable[str]) – The country codes to get lifetimes for.
- Returns
lifetimes – A frame of dtype bool with dates as index and an Int64Index of assets as columns. The value at lifetimes.loc[date, asset] will be True iff asset existed on date. If include_start_date is False, then lifetimes.loc[date, asset] will be false when date == asset.start_date.
- Return type
pd.DataFrame
See also
numpy.putmask(), zipline.pipeline.engine.SimplePipelineEngine._compute_root_mask()
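The effect of include_start_date on the lifetimes mask can be sketched with nested dictionaries in place of a boolean DataFrame. The asset table and the function are assumptions for illustration. Treating an asset as alive through its end date is also an assumption of this sketch.

```python
from datetime import date


def lifetimes(dates, assets, include_start_date):
    """Return {date: {sid: alive}} for assets given as {sid: (start_date, end_date)}."""
    table = {}
    for day in dates:
        row = {}
        for sid, (start, end) in assets.items():
            started = start <= day if include_start_date else start < day
            row[sid] = started and day <= end
        table[day] = row
    return table


dates = [date(2016, 1, 19), date(2016, 1, 20), date(2016, 1, 21)]
assets = {1: (date(2016, 1, 19), date(2016, 1, 20))}
with_start = lifetimes(dates, assets, include_start_date=True)
without_start = lifetimes(dates, assets, include_start_date=False)
print([with_start[d][1] for d in dates])
print([without_start[d][1] for d in dates])
```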
-
lookup_generic
(obj, as_of_date, country_code)[source]¶ Convert an object into an Asset or sequence of Assets.
This method exists primarily as a convenience for implementing user-facing APIs that can handle multiple kinds of input. It should not be used for internal code where we already know the expected types of our inputs.
- Parameters
obj (int, str, Asset, ContinuousFuture, or iterable) – The object to be converted into one or more Assets. Integers are interpreted as sids. Strings are interpreted as tickers. Assets and ContinuousFutures are returned unchanged.
as_of_date (pd.Timestamp or None) – Timestamp to use to disambiguate ticker lookups. Has the same semantics as in lookup_symbol.
country_code (str or None) – ISO-3166 country code to use to disambiguate ticker lookups. Has the same semantics as in lookup_symbol.
- Returns
matches, missing – matches is the result of the conversion. missing is a list containing any values that couldn’t be resolved. If obj is not an iterable, missing will be an empty list.
- Return type
-
lookup_symbol
(symbol, as_of_date, fuzzy=False, country_code=None)[source]¶ Lookup an equity by symbol.
- Parameters
symbol (str) – The ticker symbol to resolve.
as_of_date (datetime.datetime or None) – Look up the last owner of this symbol as of this datetime. If as_of_date is None, then this can only resolve the equity if exactly one equity has ever owned the ticker.
fuzzy (bool, optional) – Should fuzzy symbol matching be used? Fuzzy symbol matching attempts to resolve differences in representations for shareclasses. For example, some people may represent the A shareclass of BRK as BRK.A, where others could write BRK_A.
country_code (str or None, optional) – The country to limit searches to. If not provided, the search will span all countries, which increases the likelihood of an ambiguous lookup.
- Returns
equity – The equity that held symbol on the given as_of_date, or the only equity to hold symbol if as_of_date is None.
- Return type
- Raises
SymbolNotFound – Raised when no equity has ever held the given symbol.
MultipleSymbolsFound – Raised when no as_of_date is given and more than one equity has held symbol. This is also raised when fuzzy=True and there are multiple candidates for the given symbol on the as_of_date. Also raised when no country_code is given and the symbol is ambiguous across multiple countries.
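The resolution rules for as_of_date and fuzzy can be sketched as follows. This is a simplified model, not the AssetFinder code: the ownerships list, the delimiter-stripping rule used for fuzzy matching, and the exception classes are assumptions made for illustration, and country_code is left out.

```python
import re
from datetime import date


class SymbolNotFound(Exception):
    """Stand-in defined for this sketch only."""


class MultipleSymbolsFound(Exception):
    """Stand-in defined for this sketch only."""


def _normalize(symbol, fuzzy):
    # Assumption: fuzzy matching ignores share-class delimiters.
    return re.sub(r'[^A-Za-z0-9]', '', symbol) if fuzzy else symbol


def lookup_symbol(ownerships, symbol, as_of_date=None, fuzzy=False):
    """Resolve symbol from a list of (symbol, sid, start_date) tuples."""
    wanted = _normalize(symbol, fuzzy)
    owners = sorted(
        (start, sid)
        for sym, sid, start in ownerships
        if _normalize(sym, fuzzy) == wanted
    )
    if not owners:
        raise SymbolNotFound(symbol)
    if as_of_date is None:
        # Only resolvable when exactly one equity ever held the symbol.
        if len({sid for _, sid in owners}) > 1:
            raise MultipleSymbolsFound(symbol)
        return owners[0][1]
    held = [sid for start, sid in owners if start <= as_of_date]
    if not held:
        # Assumption: a date before the first owner counts as not found.
        raise SymbolNotFound(symbol)
    return held[-1]  # the last owner as of as_of_date


ownerships = [
    ('BRK.A', 1, date(2000, 1, 3)),
    ('XYZ', 2, date(2001, 1, 2)),
    ('XYZ', 3, date(2010, 1, 4)),
]
early_owner = lookup_symbol(ownerships, 'XYZ', date(2005, 1, 3))
late_owner = lookup_symbol(ownerships, 'XYZ', date(2015, 1, 2))
fuzzy_match = lookup_symbol(ownerships, 'BRK_A', fuzzy=True)
```

Looking up 'XYZ' without an as_of_date raises MultipleSymbolsFound in this model because two sids have held that ticker.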
-
lookup_symbols
(symbols, as_of_date, fuzzy=False, country_code=None)[source]¶ Lookup a list of equities by symbol.
Equivalent to:
[finder.lookup_symbol(s, as_of, fuzzy) for s in symbols]
but potentially faster because repeated lookups are memoized.
- Parameters
symbols (sequence[str]) – Sequence of ticker symbols to resolve.
as_of_date (pd.Timestamp) – Forwarded to lookup_symbol.
fuzzy (bool, optional) – Forwarded to lookup_symbol.
country_code (str or None, optional) – The country to limit searches to. If not provided, the search will span all countries, which increases the likelihood of an ambiguous lookup.
- Returns
equities
- Return type
-
retrieve_all
(sids, default_none=False)[source]¶ Retrieve all assets in sids.
- Parameters
sids (iterable of int) – Assets to retrieve.
default_none (bool) – If True, return None for failed lookups. If False, raise SidsNotFound.
- Returns
assets – A list of the same length as sids containing Assets (or Nones) corresponding to the requested sids.
- Return type
- Raises
SidsNotFound – When a requested sid is not found and default_none=False.
-
retrieve_equities
(sids)[source]¶ Retrieve Equity objects for a list of sids.
Users generally shouldn’t need to use this method (instead, they should prefer the more general/friendly retrieve_assets), but it has a documented interface and tests because it’s used upstream.
-
retrieve_futures_contracts
(sids)[source]¶ Retrieve Future objects for an iterable of sids.
Users generally shouldn’t need to use this method (instead, they should prefer the more general/friendly retrieve_assets), but it has a documented interface and tests because it’s used upstream.
-
property
sids
¶ All the sids in the asset finder.
-
class
zipline.data.data_portal.
DataPortal
(asset_finder, trading_calendar, first_trading_day, equity_daily_reader=None, equity_minute_reader=None, future_daily_reader=None, future_minute_reader=None, adjustment_reader=None, last_available_session=None, last_available_minute=None, minute_history_prefetch_length=1560, daily_history_prefetch_length=40)[source]¶ Interface to all of the data that a zipline simulation needs.
This is used by the simulation runner to answer questions about the data, like getting the prices of assets on a given day or to service history calls.
- Parameters
asset_finder (zipline.assets.assets.AssetFinder) – The AssetFinder instance used to resolve assets.
trading_calendar (zipline.utils.calendar.exchange_calendar.TradingCalendar) – The calendar instance used to provide minute->session information.
first_trading_day (pd.Timestamp) – The first trading day for the simulation.
equity_daily_reader (BcolzDailyBarReader, optional) – The daily bar reader for equities. This will be used to service daily data backtests or daily history calls in a minute backtest. If a daily bar reader is not provided but a minute bar reader is, the minutes will be rolled up to serve the daily requests.
equity_minute_reader (BcolzMinuteBarReader, optional) – The minute bar reader for equities. This will be used to service minute data backtests or minute history calls. This can be used to serve daily calls if no daily bar reader is provided.
future_daily_reader (BcolzDailyBarReader, optional) – The daily bar reader for futures. This will be used to service daily data backtests or daily history calls in a minute backtest. If a daily bar reader is not provided but a minute bar reader is, the minutes will be rolled up to serve the daily requests.
future_minute_reader (BcolzFutureMinuteBarReader, optional) – The minute bar reader for futures. This will be used to service minute data backtests or minute history calls. This can be used to serve daily calls if no daily bar reader is provided.
adjustment_reader (SQLiteAdjustmentReader, optional) – The adjustment reader. This is used to apply splits, dividends, and other adjustment data to the raw data from the readers.
last_available_session (pd.Timestamp, optional) – The last session to make available in session-level data.
last_available_minute (pd.Timestamp, optional) – The last minute to make available in minute-level data.
-
get_adjusted_value
(asset, field, dt, perspective_dt, data_frequency, spot_value=None)[source]¶ Returns a scalar value representing the value of the desired asset’s field at the given dt with adjustments applied.
- Parameters
asset (Asset) – The asset whose data is desired.
field ({'open', 'high', 'low', 'close', 'volume', 'price', 'last_traded'}) – The desired field of the asset.
dt (pd.Timestamp) – The timestamp for the desired value.
perspective_dt (pd.Timestamp) – The timestamp from which the data is being viewed back from.
data_frequency (str) – The frequency of the data to query; i.e. whether the data is ‘daily’ or ‘minute’ bars
- Returns
value – The value of the given field for asset at dt with any adjustments known by perspective_dt applied. The return type is based on the field requested. If the field is one of ‘open’, ‘high’, ‘low’, ‘close’, or ‘price’, the value will be a float. If the field is ‘volume’ the value will be an int. If the field is ‘last_traded’ the value will be a Timestamp.
- Return type
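The role of perspective_dt can be illustrated with a short sketch: a value observed at dt is scaled by every adjustment that has become known by perspective_dt. The adjusted_value helper, the (effective_date, ratio) tuples, and the exact window boundaries are assumptions of this example and do not describe DataPortal internals.

```python
from datetime import date


def adjusted_value(spot_value, adjustments, dt, perspective_dt):
    """Scale spot_value by ratios effective after dt, up to perspective_dt.

    adjustments is a list of (effective_date, ratio) pairs. The window
    (dt, perspective_dt] is an assumption of this sketch.
    """
    value = spot_value
    for effective_date, ratio in adjustments:
        if dt < effective_date <= perspective_dt:
            value *= ratio
    return value


# A hypothetical 2-for-1 split, expressed as a price ratio of 0.5.
splits = [(date(2014, 6, 9), 0.5)]
seen_before = adjusted_value(100.0, splits, date(2014, 6, 2), date(2014, 6, 6))
seen_after = adjusted_value(100.0, splits, date(2014, 6, 2), date(2014, 6, 10))
```

Viewed from before the split the price is unchanged; viewed from after it, the same historical bar is halved so that it is comparable with post-split prices.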
-
get_adjustments
(assets, field, dt, perspective_dt)[source]¶ Returns a list of adjustments between the dt and perspective_dt for the given field and list of assets
- Parameters
assets (list of type Asset, or Asset) – The asset, or assets whose adjustments are desired.
field ({'open', 'high', 'low', 'close', 'volume', 'price', 'last_traded'}) – The desired field of the asset.
dt (pd.Timestamp) – The timestamp for the desired value.
perspective_dt (pd.Timestamp) – The timestamp from which the data is being viewed back from.
- Returns
adjustments – The adjustments to that field.
- Return type
list[Adjustment]
-
get_current_future_chain
(continuous_future, dt)[source]¶ Retrieves the future chain for the contract at the given dt according to the continuous_future specification.
-
get_fetcher_assets
(dt)[source]¶ Returns a list of assets for the current date, as defined by the fetcher data.
- Returns
A list of Asset objects.
- Return type
list
-
get_history_window
(assets, end_dt, bar_count, frequency, field, data_frequency, ffill=True)[source]¶ Public API method that returns a dataframe containing the requested history window. Data is fully adjusted.
- Parameters
assets (list of zipline.data.Asset objects) – The assets whose data is desired.
bar_count (int) – The number of bars desired.
frequency (string) – “1d” or “1m”
field (string) – The desired field of the asset.
data_frequency (string) – The frequency of the data to query; i.e. whether the data is ‘daily’ or ‘minute’ bars.
ffill (boolean) – Forward-fill missing values. Only has effect if field is ‘price’.
- Returns
A dataframe containing the requested data.
- Return type
pd.DataFrame
-
get_last_traded_dt
(asset, dt, data_frequency)[source]¶ Given an asset and dt, returns the last traded dt from the viewpoint of the given dt.
If there is a trade on the dt, the answer is the dt provided.
-
get_scalar_asset_spot_value
(asset, field, dt, data_frequency)[source]¶ Public API method that returns a scalar value representing the value of the desired asset’s field at the given dt.
- Parameters
asset (Asset) – The asset whose data is desired. This cannot be an arbitrary AssetConvertible.
field ({'open', 'high', 'low', 'close', 'volume', 'price', 'last_traded'}) – The desired field of the asset.
dt (pd.Timestamp) – The timestamp for the desired value.
data_frequency (str) – The frequency of the data to query; i.e. whether the data is ‘daily’ or ‘minute’ bars
- Returns
value – The spot value of field for asset. The return type is based on the field requested. If the field is one of ‘open’, ‘high’, ‘low’, ‘close’, or ‘price’, the value will be a float. If the field is ‘volume’ the value will be an int. If the field is ‘last_traded’ the value will be a Timestamp.
- Return type
-
get_spot_value
(assets, field, dt, data_frequency)[source]¶ Public API method that returns a scalar value representing the value of the desired asset’s field at the given dt.
- Parameters
assets (Asset, ContinuousFuture, or iterable of same.) – The asset or assets whose data is desired.
field ({'open', 'high', 'low', 'close', 'volume', 'price', 'last_traded'}) – The desired field of the asset.
dt (pd.Timestamp) – The timestamp for the desired value.
data_frequency (str) – The frequency of the data to query; i.e. whether the data is ‘daily’ or ‘minute’ bars
- Returns
value – The spot value of field for asset. The return type is based on the field requested. If the field is one of ‘open’, ‘high’, ‘low’, ‘close’, or ‘price’, the value will be a float. If the field is ‘volume’ the value will be an int. If the field is ‘last_traded’ the value will be a Timestamp.
- Return type
-
get_stock_dividends
(sid, trading_days)[source]¶ Returns all the stock dividends for a specific sid that occur in the given trading range.
- Parameters
sid (int) – The asset whose stock dividends should be returned.
trading_days (pd.DatetimeIndex) – The trading range.
- Returns
list – A list of objects with all relevant attributes populated. All timestamp fields are converted to pd.Timestamps.
-
class
zipline.sources.benchmark_source.
BenchmarkSource
(benchmark_asset, trading_calendar, sessions, data_portal, emission_rate='daily', benchmark_returns=None)[source]¶ -
daily_returns
(start, end=None)[source]¶ Returns the daily returns for the given period.
- Parameters
start (datetime) – The inclusive starting session label.
end (datetime, optional) – The inclusive ending session label. If not provided, treat start as a scalar key.
- Returns
returns – The returns in the given period. The index will be the trading calendar in the range [start, end]. If just start is provided, return the scalar value on that day.
- Return type
pd.Series or float
-
get_range
(start_dt, end_dt)[source]¶ Look up the returns for a given period.
- Parameters
start_dt (datetime) – The inclusive start label.
end_dt (datetime) – The inclusive end label.
- Returns
returns – The series of returns.
- Return type
pd.Series
See also
zipline.sources.benchmark_source.BenchmarkSource.daily_returns()
This method expects minute inputs if emission_rate == 'minute' and session labels when emission_rate == 'daily'.
-
get_value
(dt)[source]¶ Look up the returns for a given dt.
- Parameters
dt (datetime) – The label to look up.
- Returns
returns – The returns at the given dt or session.
- Return type
See also
zipline.sources.benchmark_source.BenchmarkSource.daily_returns()
This method expects minute inputs if emission_rate == 'minute' and session labels when emission_rate == 'daily'.
-
Bundles¶
-
zipline.data.bundles.
register
(name='__no__default__', f='__no__default__', calendar_name='NYSE', start_session=None, end_session=None, minutes_per_day=390, create_writers=True)¶ Register a data bundle ingest function.
- Parameters
name (str) – The name of the bundle.
f (callable) –
The ingest function. This function will be passed:
- environ (mapping) – The environment this is being run with.
- asset_db_writer (AssetDBWriter) – The asset db writer to write into.
- minute_bar_writer (BcolzMinuteBarWriter) – The minute bar writer to write into.
- daily_bar_writer (BcolzDailyBarWriter) – The daily bar writer to write into.
- adjustment_writer (SQLiteAdjustmentWriter) – The adjustment db writer to write into.
- calendar (trading_calendars.TradingCalendar) – The trading calendar to ingest for.
- start_session (pd.Timestamp) – The first session of data to ingest.
- end_session (pd.Timestamp) – The last session of data to ingest.
- cache (DataFrameCache) – A mapping object to temporarily store dataframes. This should be used to cache intermediates in case the load fails. This will be automatically cleaned up after a successful load.
- show_progress (bool) – Show the progress for the current load where possible.
calendar_name (str, optional) – The name of a calendar used to align bundle data. Default is ‘NYSE’.
start_session (pd.Timestamp, optional) – The first session for which we want data. If not provided, or if the date lies outside the range supported by the calendar, the first_session of the calendar is used.
end_session (pd.Timestamp, optional) – The last session for which we want data. If not provided, or if the date lies outside the range supported by the calendar, the last_session of the calendar is used.
minutes_per_day (int, optional) – The number of minutes in each normal trading day.
create_writers (bool, optional) – Should the ingest machinery create the writers for the ingest function. This can be disabled as an optimization for cases where they are not needed, like the quantopian-quandl bundle.
Notes
This function may be used as a decorator, for example:
    @register('quandl')
    def quandl_ingest_function(...):
        ...
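How a function like this can serve both as a plain call and as a decorator, while exposing a read-only registry, can be sketched with the standard library. This is a generic pattern under stated assumptions, not the code in zipline.data.bundles: the private _bundles dict, the stored tuple, and the UnknownBundle stand-in are hypothetical.

```python
from types import MappingProxyType


class UnknownBundle(KeyError):
    """Stand-in for the exception of the same name (defined here only)."""


_bundles = {}
# Read-only view of the registry; writes must go through register/unregister.
bundles = MappingProxyType(_bundles)


def register(name, f=None, calendar_name='NYSE'):
    """Register an ingest function directly or as a decorator."""
    def _register(func):
        _bundles[name] = (calendar_name, func)
        return func  # return the function so decorator use leaves it intact

    if f is None:
        # Called as @register('name'): hand back the real decorator.
        return _register
    return _register(f)


def unregister(name):
    try:
        del _bundles[name]
    except KeyError:
        raise UnknownBundle(name)


@register('demo')
def demo_ingest(*args, **kwargs):
    return 'ingested'


register('direct', demo_ingest, calendar_name='LSE')
registered_names = sorted(bundles)
```

Assigning into bundles raises TypeError because MappingProxyType does not support item assignment, which is one way to get the immutability described for the public mapping.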
See also
-
zipline.data.bundles.
ingest
(name, environ=os.environ, date=None, show_progress=True)¶ Ingest data for a given bundle.
- Parameters
name (str) – The name of the bundle.
environ (mapping, optional) – The environment variables. By default this is os.environ.
timestamp (datetime, optional) – The timestamp to use for the load. By default this is the current time.
assets_versions (Iterable[int], optional) – Versions of the assets db to which to downgrade.
show_progress (bool, optional) – Tell the ingest function to display the progress where possible.
-
zipline.data.bundles.
load
(name, environ=os.environ, date=None)¶ Loads a previously ingested bundle.
- Parameters
name (str) – The name of the bundle.
environ (mapping, optional) – The environment variables. Defaults to os.environ.
timestamp (datetime, optional) – The timestamp of the data to lookup. Defaults to the current time.
- Returns
bundle_data – The raw data readers for this bundle.
- Return type
BundleData
-
zipline.data.bundles.
unregister
(name)¶ Unregister a bundle.
- Parameters
name (str) – The name of the bundle to unregister.
- Raises
UnknownBundle – Raised when no bundle has been registered with the given name.
See also
-
zipline.data.bundles.
bundles
¶ The bundles that have been registered as a mapping from bundle name to bundle data. This mapping is immutable and may only be updated through register() or unregister().
Risk Metrics¶
Algorithm State¶
-
class
zipline.finance.ledger.
Ledger
(trading_sessions, capital_base, data_frequency)[source]¶ The ledger tracks all orders and transactions as well as the current state of the portfolio and positions.
-
portfolio
¶ The updated portfolio being managed.
-
account
¶ The updated account being managed.
-
position_tracker
¶ The current set of positions.
- Type
-
todays_returns
¶ The current day’s returns. In minute emission mode, this is the partial day’s returns. In daily emission mode, this is daily_returns[session].
- Type
-
daily_returns_series
¶ The daily returns series. Days that have not yet finished will hold a value of np.nan.
- Type
pd.Series
-
daily_returns_array
¶ The daily returns as an ndarray. Days that have not yet finished will hold a value of np.nan.
- Type
np.ndarray
-
orders
(dt=None)[source]¶ Retrieve the dict-form of all of the orders in a given bar or for the whole simulation.
-
override_account_fields
(settled_cash=sentinel('not_overridden'), accrued_interest=sentinel('not_overridden'), buying_power=sentinel('not_overridden'), equity_with_loan=sentinel('not_overridden'), total_positions_value=sentinel('not_overridden'), total_positions_exposure=sentinel('not_overridden'), regt_equity=sentinel('not_overridden'), regt_margin=sentinel('not_overridden'), initial_margin_requirement=sentinel('not_overridden'), maintenance_margin_requirement=sentinel('not_overridden'), available_funds=sentinel('not_overridden'), excess_liquidity=sentinel('not_overridden'), cushion=sentinel('not_overridden'), day_trades_remaining=sentinel('not_overridden'), leverage=sentinel('not_overridden'), net_leverage=sentinel('not_overridden'), net_liquidation=sentinel('not_overridden'))[source]¶ Override fields on self.account.
-
property
portfolio
¶ Compute the current portfolio.
Notes
This is cached; repeated access will not recompute the portfolio until the portfolio may have changed.
-
process_commission
(commission)[source]¶ Process the commission.
- Parameters
commission (zp.Event) – The commission being paid.
-
process_dividends
(next_session, asset_finder, adjustment_reader)[source]¶ Process dividends for the next session.
This will earn us any dividends whose ex-date is the next session, and pay out any dividends whose pay-date is the next session.
-
process_order
(order)[source]¶ Keep track of an order that was placed.
- Parameters
order (zp.Order) – The order to record.
-
process_transaction
(transaction)[source]¶ Add a transaction to ledger, updating the current state as needed.
- Parameters
transaction (zp.Transaction) – The transaction to execute.
-
-
class
zipline.protocol.
Portfolio
(start_date=None, capital_base=0.0)[source]¶ Object providing read-only access to current portfolio state.
- Parameters
start_date (pd.Timestamp) – The start date for the period being recorded.
capital_base (float) – The starting value for the portfolio. This will be used as the starting cash, current cash, and portfolio value.
-
positions
¶ Dict-like object containing information about currently-held positions.
- Type
zipline.protocol.Positions
-
portfolio_value
¶ Current liquidation value of the portfolio’s holdings. This is equal to
cash + sum(shares * price)
- Type
-
property
current_portfolio_weights
¶ Compute each asset’s weight in the portfolio by calculating its held value divided by the total value of all positions.
Each equity’s value is its price times the number of shares held. Each futures contract’s value is its unit price times number of shares held times the multiplier.
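The weight calculation described above can be sketched as follows. The positions list and its keys are hypothetical, and the denominator follows the wording of this entry (the total value of all positions); treat it as an illustration, not the property's source.

```python
def current_portfolio_weights(positions):
    """Return {sid: weight} for a list of position dicts.

    Each dict has 'sid', 'amount', 'price' and an optional 'multiplier'
    (1 for equities, the contract multiplier for futures).
    """
    values = {
        p['sid']: p['amount'] * p['price'] * p.get('multiplier', 1)
        for p in positions
    }
    total = sum(values.values())
    if total == 0:
        # Avoid dividing by zero when nothing is held or values cancel out.
        return {sid: 0.0 for sid in values}
    return {sid: value / total for sid, value in values.items()}


positions = [
    {'sid': 1, 'amount': 10, 'price': 50.0},                   # equity
    {'sid': 2, 'amount': 1, 'price': 10.0, 'multiplier': 50},  # future
]
weights = current_portfolio_weights(positions)
```

Both positions are worth 500 here, so each receives a weight of 0.5.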
-
class
zipline.protocol.
Account
[source]¶ The account object tracks information about the trading account. The values are updated as the algorithm runs and its keys remain unchanged. If connected to a broker, one can update these values with the trading account values as reported by the broker.
-
class
zipline.finance.ledger.
PositionTracker
(data_frequency)[source]¶ The current state of the positions held.
- Parameters
data_frequency ({'daily', 'minute'}) – The data frequency of the simulation.
-
earn_dividends
(cash_dividends, stock_dividends)[source]¶ Given a list of dividends whose ex_dates are all the next trading day, calculate and store the cash and/or stock payments to be paid on each dividend’s pay date.
- Parameters
cash_dividends (iterable of (asset, amount, pay_date) namedtuples) –
stock_dividends (iterable of (asset, payment_asset, ratio, pay_date) namedtuples) –
-
handle_splits
(splits)[source]¶ Processes a list of splits by modifying any positions as needed.
- Parameters
splits (list) – A list of splits. Each split is a tuple of (asset, ratio).
- Returns
The leftover cash from fractional shares after modifying each position.
- Return type
int
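Where leftover cash comes from can be shown for a single position. The sketch below assumes the split ratio is a price ratio (0.5 for a 2-for-1 split, 3.0 for a 1-for-3 reverse split) and that fractional shares are cashed out at the adjusted cost basis; both are assumptions of this example, and apply_split is a hypothetical helper.

```python
import math


def apply_split(amount, cost_basis, ratio):
    """Return (whole_shares, new_cost_basis, leftover_cash) after a split."""
    raw_share_count = amount / ratio
    whole_shares = math.floor(raw_share_count)
    fractional_shares = raw_share_count - whole_shares
    # The per-share cost basis moves in the opposite direction to the count.
    new_cost_basis = round(cost_basis * ratio, 2)
    leftover_cash = round(fractional_shares * new_cost_basis, 2)
    return whole_shares, new_cost_basis, leftover_cash


# 100 shares through a 1-for-3 reverse split leave a third of a share over.
shares, basis, cash = apply_split(100, 10.0, 3.0)
```

A 2-for-1 split of the same position (ratio 0.5) produces 200 whole shares and no leftover cash.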
-
pay_dividends
(next_trading_day)[source]¶ Returns a cash payment based on the dividends that should be paid out according to the accumulated bookkeeping of earned, unpaid, and stock dividends.
-
property
stats
¶ The current status of the positions.
- Returns
stats – The current position stats.
- Return type
Notes
This is cached; repeated access will not recompute the stats until the stats may have changed.
-
class
zipline.finance._finance_ext.
PositionStats
¶ Computed values from the current positions.
-
gross_exposure
¶ The gross position exposure.
- Type
float64
-
gross_value
¶ The gross position value.
- Type
float64
-
long_exposure
¶ The exposure of just the long positions.
- Type
float64
-
long_value
¶ The value of just the long positions.
- Type
float64
-
net_exposure
¶ The net position exposure.
- Type
float64
-
net_value
¶ The net position value.
- Type
float64
-
short_exposure
¶ The exposure of just the short positions.
- Type
float64
-
short_value
¶ The value of just the short positions.
- Type
float64
-
longs_count
¶ The number of long positions.
- Type
int64
-
shorts_count
¶ The number of short positions.
- Type
int64
-
position_exposure_array
¶ The exposure of each position in the same order as position_tracker.positions.
- Type
np.ndarray[float64]
-
position_exposure_series
¶ The exposure of each position in the same order as position_tracker.positions. The index is the numeric sid of each asset.
- Type
pd.Series[float64]
Notes
position_exposure_array and position_exposure_series share the same underlying memory. The array interface should be preferred for better performance if you access the values every minute.
position_exposure_array and position_exposure_series may be mutated when the position tracker next updates the stats. Do not rely on these objects being preserved across accesses to stats. If you need to freeze the values, you must take a copy.
Built-in Metrics¶
-
class
zipline.finance.metrics.metric.
SimpleLedgerField
(ledger_field, packet_field=None)[source]¶ Emit the current value of a ledger field every bar or every session.
-
class
zipline.finance.metrics.metric.
DailyLedgerField
(ledger_field, packet_field=None)[source]¶ Like SimpleLedgerField but also puts the current value in the cumulative_perf section.
-
class
zipline.finance.metrics.metric.
StartOfPeriodLedgerField
(ledger_field, packet_field=None)[source]¶ Keep track of the value of a ledger field at the start of the period.
-
class
zipline.finance.metrics.metric.
Returns
[source]¶ Tracks the daily and cumulative returns of the algorithm.
-
class
zipline.finance.metrics.metric.
BenchmarkReturnsAndVolatility
[source]¶ Tracks daily and cumulative returns for the benchmark as well as the volatility of the benchmark returns.
-
class
zipline.finance.metrics.metric.
CashFlow
[source]¶ Tracks daily and cumulative cash flow.
Notes
For historical reasons, this field is named ‘capital_used’ in the packets.
-
class
zipline.finance.metrics.metric.
ReturnsStatistic
(function, field_name=None)[source]¶ A metric that reports an end of simulation scalar or time series computed from the algorithm returns.
- Parameters
function (callable) – The function to call on the daily returns.
field_name (str, optional) – The name of the field. If not provided, it will be function.__name__.
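The field_name default can be illustrated with a stripped-down stand-in. ReturnsStatisticSketch and its end_of_simulation signature are hypothetical and much simpler than the metric interface zipline uses; the point is only that the statistic is stored under function.__name__ unless a name is given.

```python
class ReturnsStatisticSketch:
    """Simplified stand-in: apply a function to the daily returns once."""

    def __init__(self, function, field_name=None):
        if field_name is None:
            field_name = function.__name__
        self._function = function
        self._field_name = field_name

    def end_of_simulation(self, packet, daily_returns):
        # Store the computed statistic under the chosen field name.
        packet[self._field_name] = self._function(daily_returns)


def total_return(returns):
    """Compound a sequence of simple daily returns."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0


packet = {}
ReturnsStatisticSketch(total_return).end_of_simulation(packet, [0.1, -0.1])
ReturnsStatisticSketch(max, 'best_day').end_of_simulation(packet, [0.1, -0.1])
```

After both calls the packet holds a 'total_return' entry named after the function and a 'best_day' entry named explicitly.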
Metrics Sets¶
-
zipline.finance.metrics.
register
(name, function=None)¶ Register a new metrics set.
- Parameters
name (str) – The name of the metrics set
function (callable) – The callable which produces the metrics set.
Notes
This may be used as a decorator if only name is passed.
See also
zipline.finance.metrics.get_metrics_set(), zipline.finance.metrics.unregister_metrics_set()
-
zipline.finance.metrics.
load
(name)¶ Return an instance of the metrics set registered with the given name.
- Returns
metrics – A new instance of the metrics set.
- Return type
set[Metric]
- Raises
ValueError – Raised when no metrics set is registered to name.
-
zipline.finance.metrics.
unregister
(name)¶ Unregister an existing metrics set.
- Parameters
name (str) – The name of the metrics set
See also
zipline.finance.metrics.register_metrics_set()
-
zipline.finance.metrics.
metrics_sets
¶ The metrics sets that have been registered as a mapping from metrics set name to load function. This mapping is immutable and may only be updated through register() or unregister().
Utilities¶
Caching¶
-
class
zipline.utils.cache.
CachedObject
(value, expires)[source]¶ A simple struct for maintaining a cached object with an expiration date.
- Parameters
value (object) – The object to cache.
expires (datetime-like) – Expiration date of value. The cache is considered invalid for dates strictly greater than expires.
Examples
>>> from pandas import Timestamp, Timedelta
>>> expires = Timestamp('2014', tz='UTC')
>>> obj = CachedObject(1, expires)
>>> obj.unwrap(expires - Timedelta('1 minute'))
1
>>> obj.unwrap(expires)
1
>>> obj.unwrap(expires + Timedelta('1 minute'))
...
Traceback (most recent call last):
    ...
Expired: 2014-01-01 00:00:00+00:00
-
class
zipline.utils.cache.
ExpiringCache
(cache=None, cleanup=<function ExpiringCache.<lambda>>)[source]¶ A cache of multiple CachedObjects, which returns the wrapped value, or raises and deletes the CachedObject if the value has expired.
- Parameters
cache (dict-like, optional) – An instance of a dict-like object which needs to support at least: __delitem__, __getitem__, __setitem__. If None, then a dict is used as a default.
cleanup (callable, optional) – A method that takes a single argument, a cached object, and is called upon expiry of the cached object, prior to deleting the object. If not provided, defaults to a no-op.
Examples
>>> from pandas import Timestamp, Timedelta
>>> expires = Timestamp('2014', tz='UTC')
>>> value = 1
>>> cache = ExpiringCache()
>>> cache.set('foo', value, expires)
>>> cache.get('foo', expires - Timedelta('1 minute'))
1
>>> cache.get('foo', expires + Timedelta('1 minute'))
Traceback (most recent call last):
    ...
KeyError: 'foo'
-
class
zipline.utils.cache.
dataframe_cache
(path=None, lock=None, clean_on_failure=True, serialization='msgpack')[source]¶ A disk-backed cache for dataframes.
dataframe_cache is a mutable mapping from string names to pandas DataFrame objects. This object may be used as a context manager to delete the cache directory on exit.
- Parameters
path (str, optional) – The directory path to the cache. Files will be written as path/<keyname>.
lock (Lock, optional) – Thread lock for multithreaded/multiprocessed access to the cache. If not provided, no locking will be used.
clean_on_failure (bool, optional) – Should the directory be cleaned up if an exception is raised in the context manager.
serialization ({'msgpack', 'pickle:<n>'}, optional) – How should the data be serialized. If 'pickle' is passed, an optional pickle protocol can be passed like 'pickle:3', which says to use pickle protocol 3.
Notes
The syntax cache[:] will load all key:value pairs into memory as a dictionary. The cache uses a temporary file format that is subject to change between versions of zipline.
-
class
zipline.utils.cache.
working_file
(final_path, *args, **kwargs)[source]¶ A context manager for managing a temporary file that will be moved to a non-temporary location if no exceptions are raised in the context.
- Parameters
final_path (str) – The location to move the file when committing.
*args – Forwarded to NamedTemporaryFile.
**kwargs – Forwarded to NamedTemporaryFile.
Notes
The file is moved on __exit__ if there are no exceptions. working_file uses shutil.move() to move the actual files, meaning its guarantees are only as strong as those of shutil.move().
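The commit-on-success behaviour described in these notes can be sketched with the standard library. This is a minimal re-creation of the pattern under stated assumptions, not the class in zipline.utils.cache: it is written as a generator-based context manager, and forcing delete=False on the temporary file is a choice made for this example.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def working_file(final_path, *args, **kwargs):
    """Write to a temporary file; move it to final_path only on success."""
    kwargs['delete'] = False  # keep the file around so it can be moved
    tmp = tempfile.NamedTemporaryFile(*args, **kwargs)
    try:
        yield tmp
    except BaseException:
        # An error inside the with-block: discard the partial file.
        tmp.close()
        os.remove(tmp.name)
        raise
    else:
        tmp.close()
        shutil.move(tmp.name, final_path)


target_dir = tempfile.mkdtemp()
final_path = os.path.join(target_dir, 'data.txt')
with working_file(final_path, mode='w') as f:
    f.write('hello')
with open(final_path) as f:
    committed = f.read()

failed_path = os.path.join(target_dir, 'never.txt')
try:
    with working_file(failed_path, mode='w') as f:
        raise RuntimeError('abort')
except RuntimeError:
    pass
```

The first block commits its contents to final_path; the second raises, so failed_path is never created and the temporary file is removed.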
-
class
zipline.utils.cache.
working_dir
(final_path, *args, **kwargs)[source]¶ A context manager for managing a temporary directory that will be moved to a non-temporary location if no exceptions are raised in the context.
- Parameters
final_path (str) – The location to move the directory when committing.
*args – Forwarded to tmp_dir.
**kwargs – Forwarded to tmp_dir.
Notes
The directory is moved on __exit__ if there are no exceptions. working_dir uses dir_util.copy_tree() to move the actual files, meaning its guarantees are only as strong as those of dir_util.copy_tree().
Command Line¶
-
zipline.utils.cli.
maybe_show_progress
(it, show_progress, **kwargs)[source]¶ Optionally show a progress bar for the given iterator.
- Parameters
it (iterable) – The underlying iterator.
show_progress (bool) – Should progress be shown.
**kwargs – Forwarded to the click progress bar.
- Returns
itercontext – A context manager whose enter is the actual iterator to use.
- Return type
context manager
Examples
with maybe_show_progress([1, 2, 3], True) as ns:
    for n in ns:
        ...